Implementing drivers in python #2292

tokoko · 2024-10-30T12:44:32Z

What would you like help with?

I suppose this is already possible by duck typing classes to look like the ones in adbc_driver_manager, but I'm curious what the general attitude is towards implementing new drivers in Python. A couple of valid use cases that come to mind are:

- Cases where a Python package for the backend already exists, wraps another language, and can output Arrow. For example, I eventually opted to go with Rust for DataFusion, but I could have instead implemented it in Python using datafusion-python. I get that it's not an optimal solution, as the end result would only be accessible from Python, but it could have accelerated prototype development.
- Wrappers around existing drivers that augment them with additional functionality. For example, a Python driver that wraps a SQLite driver and adds Substrait capabilities by translating Substrait to SQL before invoking the actual commands, or a driver that's powered by something like sqlglot and does dialect translation in the Python layer.
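To make the second case concrete, here is a minimal sketch of a wrapper that rewrites SQL before delegating to an underlying DB-API connection. Everything in it is an assumption for illustration: `TranslatingConnection`, `TranslatingCursor`, and `toy_translate` are hypothetical names, the standard-library `sqlite3` module stands in for a real ADBC SQLite connection, and the string replacement stands in for a real translator such as sqlglot.

```python
import sqlite3
from typing import Callable


class TranslatingCursor:
    """Hypothetical wrapper: rewrites SQL, then delegates to a real cursor."""

    def __init__(self, inner, translate: Callable[[str], str]):
        self._inner = inner
        self._translate = translate

    def execute(self, operation, parameters=()):
        # Translate the statement before handing it to the wrapped cursor.
        self._inner.execute(self._translate(operation), parameters)
        return self

    def __getattr__(self, name):
        # Anything not overridden (fetchall, description, close, ...) is
        # forwarded to the wrapped cursor unchanged.
        return getattr(self._inner, name)


class TranslatingConnection:
    """Hypothetical wrapper around any DB-API style connection."""

    def __init__(self, inner, translate: Callable[[str], str]):
        self._inner = inner
        self._translate = translate

    def cursor(self):
        return TranslatingCursor(self._inner.cursor(), self._translate)

    def __getattr__(self, name):
        return getattr(self._inner, name)


def toy_translate(sql: str) -> str:
    # Toy stand-in for a dialect translator: SQLite has no NOW() function.
    return sql.replace("NOW()", "CURRENT_TIMESTAMP")


conn = TranslatingConnection(sqlite3.connect(":memory:"), toy_translate)
cur = conn.cursor()
cur.execute("SELECT NOW() IS NOT NULL")
rows = cur.fetchall()
```

Because unknown attributes are forwarded, any Arrow-specific methods on the wrapped cursor would stay reachable through the wrapper without being listed explicitly.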

I'm wondering if it might be a good idea to add a dummy python driver implementation to encourage such use cases.

lidavidm · 2024-10-31T00:41:26Z

Python wrappers are fine; the SQLite driver already adds a couple of extra methods, IIRC.

I'm not sure implementing drivers in Python makes any sense. At that point what you're actually doing is just implementing DB-API, no?

paleolimbot · 2024-10-31T01:52:14Z

For what it's worth I think it's something that is perfectly valid to enable (although there is a long list of things ahead of it for me personally). Kirill and I chatted briefly about this in R since it would enable existing DBI drivers to more easily implement an ADBC-native interface (allowing us to migrate end-user usage to ADBC). In R we are perhaps more actively trying to move on from DBI than Python users are trying to move on from dbapi.

The ability to instantly prototype a driver and test it shouldn't be undersold, either (although we could make a project with the boilerplate in Go, C++, and Rust with a few Python tests that might accomplish something similar).

tokoko · 2024-10-31T06:42:36Z

> I'm not sure implementing drivers in Python makes any sense. At that point what you're actually doing is just implementing DB-API, no?

Sure, I guess that is what I mean, but to be fair it's not just DB-API, right? It's a heavily ADBC-flavored DB-API at best. Most of the features that would draw people this way are ADBC/Arrow-specific: fetch_arrow, get_objects, partitions, substrait.

> The ability to instantly prototype a driver and test it shouldn't be undersold, either (although we could make a project with the boilerplate in Go, C++, and Rust with a few Python tests that might accomplish something similar).

I know this might not be the best comparison, but I'm sort of thinking of Python drivers as analogous to the newly added Python DataSource API in PySpark. You could argue that prototyping in Java/Scala can be just as easy, but it's all about familiarity at the end of the day, right? For PySpark users, a Python API probably means fewer hurdles for a prototype. To extend the example to this discussion, if some Python system/library is directly using ADBC (meaning DB-API with ADBC extensions) as a pluggable source, it might be easier to implement some unusual cases directly in Python, most likely in the same codebase without any additional build steps.

lidavidm · 2024-11-05T00:20:53Z

I suppose anyone is free to duck-type themselves as an ADBC driver; I'm mostly just reluctant to expand the scope to include a formal Python API specification. But maybe we should try to intentionally compete with DB-API and/or formalize some of the extensions that we (and others) make to the API.
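As a rough illustration of what formalizing an extension could look like, here is a sketch using `typing.Protocol`. It is not an existing or proposed specification: `ArrowCursor`, `ToyCursor`, and `PlainCursor` are hypothetical names. The only method name borrowed from the real API is `fetch_arrow_table`, which the `adbc_driver_manager.dbapi` cursor provides on top of DB-API; the toy cursor returns a plain dict where a real one would return a `pyarrow.Table`.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ArrowCursor(Protocol):
    """Hypothetical protocol: a DB-API cursor plus one Arrow extension."""

    def execute(self, operation: str, parameters: Any = None) -> Any: ...
    def fetchall(self) -> Any: ...
    def fetch_arrow_table(self) -> Any: ...
    def close(self) -> None: ...


class ToyCursor:
    """Pure-Python cursor that satisfies the protocol by shape alone."""

    def __init__(self):
        self._rows = []

    def execute(self, operation, parameters=None):
        # A toy "backend": echo the statement back as a single row.
        self._rows = [(operation,)]
        return self

    def fetchall(self):
        return list(self._rows)

    def fetch_arrow_table(self):
        # Assumption: a real driver would return a pyarrow.Table here;
        # a column-oriented dict keeps the sketch dependency-free.
        return {"statement": [row[0] for row in self._rows]}

    def close(self):
        self._rows = []


class PlainCursor:
    """DB-API shaped cursor without the Arrow extension."""

    def execute(self, operation, parameters=None):
        return self

    def fetchall(self):
        return []

    def close(self):
        pass


# runtime_checkable protocols only check that the methods exist,
# which is exactly the duck-typing contract discussed above.
toy_ok = isinstance(ToyCursor(), ArrowCursor)
plain_ok = isinstance(PlainCursor(), ArrowCursor)
```

A library that accepts pluggable sources could check against such a protocol instead of a concrete driver class, so a duck-typed Python implementation and a wrapped native driver would be interchangeable.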

tokoko added the Type: question Usage question label Oct 30, 2024
