feat: Add the ability to request a schema from a statement #1514

paleolimbot · 2024-02-05T18:53:49Z

There are some situations (e.g., #1513) where the mapping of a database type to an Arrow type is not canonical. SQLite is an example of an end-member where all mappings of a database result are approximate (and not necessarily stable between queries).

When I rewrote the typing part of the PostgreSQL driver, I intentionally separated the "guess Arrow type from Postgres type" and "convert Postgres data to Arrow data" components. Given an Arrow type, it's reasonably straightforward to write the conversion from a Postgres type. The hard (and imprecise) part is the guessing.
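To illustrate the separation described above, here is a minimal sketch in C. It is not the PostgreSQL driver's actual code: the `SketchArrowType` enum, `SketchGuessArrowType()`, and `SketchCanConvert()` are hypothetical names, and the type table is a tiny assumed subset chosen for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for Arrow types; not the driver's real enums. */
enum SketchArrowType {
  SKETCH_UNKNOWN = 0,
  SKETCH_INT64,
  SKETCH_DOUBLE,
  SKETCH_STRING
};

/* Step 1 (the imprecise part): guess an Arrow type from a Postgres type name.
 * Assumption for this sketch: NUMERIC has no exact match, so guess string. */
enum SketchArrowType SketchGuessArrowType(const char* pg_type) {
  assert(pg_type != NULL);
  if (strcmp(pg_type, "int8") == 0) return SKETCH_INT64;
  if (strcmp(pg_type, "float8") == 0) return SKETCH_DOUBLE;
  if (strcmp(pg_type, "numeric") == 0) return SKETCH_STRING;
  if (strcmp(pg_type, "text") == 0) return SKETCH_STRING;
  return SKETCH_UNKNOWN;
}

/* Step 2 (the straightforward part): given a target Arrow type, report
 * whether this toy driver has a conversion from the Postgres type.
 * Returns 1 if a conversion exists, 0 otherwise. */
int SketchCanConvert(const char* pg_type, enum SketchArrowType target) {
  assert(pg_type != NULL);
  if (target == SKETCH_UNKNOWN) return 0;
  /* The guessed type is always convertible by construction. */
  if (SketchGuessArrowType(pg_type) == target) return 1;
  /* Extra conversions a caller could ask for explicitly (assumed). */
  if (strcmp(pg_type, "numeric") == 0 && target == SKETCH_DOUBLE) return 1;
  if (strcmp(pg_type, "int8") == 0 && target == SKETCH_DOUBLE) return 1;
  return 0;
}
```

In this sketch the guess for `numeric` is a string, but a caller who knows a double is acceptable can still get one, because the conversion step does not depend on the guess.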

Instead of providing a possibly ever-accumulating pile of options along the lines of "adbc.postgresql.statement.numeric_as_double" = "true", I wonder if we could add AdbcStatementRequestSchema(struct AdbcStatement*, struct ArrowSchema*). Often the query author knows this information (or is using a SQL generation tool that already knows what column types to expect). In more dynamic wrappers, one could inspect AdbcStatementExecuteSchema() and look for specific types. This model fits nicely with how the Python __arrow_c_stream__(requested_schema=xxxx) protocol is parameterized as well.

I'm not sure whether the request should be best-effort or error-if-cannot-be-satisfied (or whether the caller should be able to choose). But without the ability to pass an ArrowSchema*, it's very difficult to work around this, although you could provide an IPC-serialized schema to AdbcStatementSetOptionBytes().
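The difference between the two behaviours can be sketched in C as follows. Everything here is hypothetical: `DemoType`, `DemoRequestMode`, `DemoCanSatisfy()`, and `DemoResolveSchema()` are illustrative names, not part of ADBC, and the conversion rules are assumed for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical column types and request modes; not part of the ADBC API. */
enum DemoType { DEMO_INT64, DEMO_DOUBLE, DEMO_STRING };
enum DemoRequestMode { DEMO_BEST_EFFORT, DEMO_STRICT };

/* Can this toy driver produce `requested` for a column it would otherwise
 * return as `guessed`? The rules below are assumptions for illustration. */
int DemoCanSatisfy(enum DemoType guessed, enum DemoType requested) {
  if (guessed == requested) return 1;
  if (requested == DEMO_STRING) return 1; /* anything can be rendered as text */
  if (guessed == DEMO_INT64 && requested == DEMO_DOUBLE) return 1;
  return 0;
}

/* Resolve the output type of each of the n columns.
 * Best-effort: fall back to the guessed type when a request cannot be met.
 * Strict: return nonzero as soon as one request cannot be met.
 * Returns 0 on success. */
int DemoResolveSchema(const enum DemoType* guessed,
                      const enum DemoType* requested, size_t n,
                      enum DemoRequestMode mode, enum DemoType* out) {
  assert(guessed != NULL && requested != NULL && out != NULL);
  for (size_t i = 0; i < n; i++) {
    if (DemoCanSatisfy(guessed[i], requested[i])) {
      out[i] = requested[i];
    } else if (mode == DEMO_BEST_EFFORT) {
      out[i] = guessed[i];
    } else {
      return 1; /* strict mode: the request cannot be satisfied */
    }
  }
  return 0;
}
```

With best-effort semantics the caller has to inspect the resulting schema to find out which requests were honoured; with strict semantics the failure is reported up front, but a single unsupported column rejects the whole request.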


lidavidm · 2024-02-05T19:04:34Z

An IPC-serialized schema is an option, but that feels rather gross. The other way would be to have a fake "option" that expects you to Bind() an (empty) schema after setting it (which is also gross because of how stateful/procedural it is, but at least it doesn't bounce through IPC).

lidavidm · 2024-02-05T19:04:48Z

Adding an explicit function would be best if we're going to expand things

CurtHagenlocher · 2024-02-05T19:11:50Z

Other database client specifications typically have some provision for letting the caller say "give me the value in column 1 as a <type>". In ODBC, this is via the binding mechanism, and both JDBC and ADO.NET let a caller say "getString" (and it's up to the driver to perform a conversion or fail).

paleolimbot added this to the ADBC API Specification Wishlist milestone Feb 5, 2024

lupko mentioned this issue Feb 6, 2024

feat(c/driver/postgresql): optionally return NUMERIC columns as double #1513

Open

paleolimbot mentioned this issue Aug 7, 2024

adbc_driver_postgres flatten multi dimensional array in Postgres #2063

Open
