There are some situations (e.g., #1513) where the mapping of a database type to an Arrow type is not canonical. SQLite is an example of an end-member where all mappings of a database result are approximate (and not necessarily stable between queries).
When I rewrote the typing part of the PostgreSQL driver, I intentionally separated the "guess Arrow type from Postgres type" and "convert Postgres data to Arrow data" components. Given an Arrow type, it's reasonably straightforward to write the conversion from a Postgres type. The hard (and imprecise) part is the guessing.
Instead of providing a possibly ever-accumulating pile of options along the lines of "adbc.postgresql.statement.numeric_as_double" = "true", I wonder if we could add AdbcStatementRequestSchema(struct AdbcStatement*, struct ArrowSchema*). Often the query author knows this information (or is using a SQL generation tool that already knows what column types to expect). In more dynamic wrappers, one could inspect AdbcStatementExecuteSchema() and look for specific types. This model fits nicely with how the Python __arrow_c_stream__(requested_schema=xxxx) protocol is parameterized as well.
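To make that concrete, here is a rough sketch of how a caller might use such a function. This is illustration only: `AdbcStatementRequestSchema` does not exist in `adbc.h` today, its signature is assumed to mirror the other `AdbcStatement*` functions, the include paths depend on your build setup, and the nanoarrow helpers are just one convenient way to build the requested schema.

```c
#include "adbc.h"                    // ADBC C API (path depends on install)
#include <nanoarrow/nanoarrow.h>     // used here only to build the ArrowSchema

// Proposed (non-existent) signature, mirroring other AdbcStatement* functions:
// AdbcStatusCode AdbcStatementRequestSchema(struct AdbcStatement* statement,
//                                           struct ArrowSchema* schema,
//                                           struct AdbcError* error);

static AdbcStatusCode RunWithRequestedSchema(struct AdbcStatement* stmt,
                                             struct AdbcError* error) {
  // The caller knows it wants the NUMERIC column "price" back as float64.
  struct ArrowSchema requested;
  ArrowSchemaInit(&requested);
  ArrowSchemaSetTypeStruct(&requested, 1);
  ArrowSchemaSetType(requested.children[0], NANOARROW_TYPE_DOUBLE);
  ArrowSchemaSetName(requested.children[0], "price");

  AdbcStatusCode status =
      AdbcStatementSetSqlQuery(stmt, "SELECT price FROM items", error);
  if (status != ADBC_STATUS_OK) return status;  // (cleanup elided)

  // Proposed call: the driver would copy or take ownership of the schema and
  // use it instead of guessing Arrow types from the database's type metadata.
  status = AdbcStatementRequestSchema(stmt, &requested, error);
  if (status != ADBC_STATUS_OK) return status;  // (cleanup elided)

  struct ArrowArrayStream stream;
  int64_t rows_affected;
  return AdbcStatementExecuteQuery(stmt, &stream, &rows_affected, error);
}
```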
I'm not sure whether the request should be best-effort or error-if-it-cannot-be-satisfied (or whether the caller should be able to choose). But without the ability to pass an ArrowSchema*, this is very hard to work around; the closest existing escape hatch would be to pass an IPC-serialized schema to AdbcStatementSetOptionBytes().
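For comparison, that workaround would look roughly like the sketch below. The option key is invented purely for illustration (no driver defines one today), and producing the IPC-encapsulated Schema message is assumed to happen on the caller side, e.g. with Arrow's IPC serialization.

```c
#include <stddef.h>
#include <stdint.h>

#include "adbc.h"  // ADBC C API (path depends on install)

// Hypothetical option key; the value would be an IPC-encapsulated Schema
// message produced by the caller.
static AdbcStatusCode SetRequestedSchemaViaIpc(struct AdbcStatement* stmt,
                                               const uint8_t* ipc_schema,
                                               size_t ipc_schema_length,
                                               struct AdbcError* error) {
  return AdbcStatementSetOptionBytes(stmt, "adbc.statement.requested_schema",
                                     ipc_schema, ipc_schema_length, error);
}
```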
IPC-serialized schema is an option, but that feels rather gross... The other way would be to have a fake "option" that expects you to Bind() an (empty) schema after setting it (which is also gross because of how stateful/procedural it is, but at least it doesn't bounce through IPC).
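Roughly what that would look like from the caller's side; the option key is made up, the zero-row array exists only to carry the requested schema, and error checks are elided.

```c
#include "adbc.h"
#include <nanoarrow/nanoarrow.h>

static AdbcStatusCode RequestSchemaViaBind(struct AdbcStatement* stmt,
                                           struct ArrowSchema* requested,
                                           struct AdbcError* error) {
  // Hypothetical option telling the driver to interpret the next Bind() as a
  // requested result schema rather than as query parameters.
  AdbcStatusCode status = AdbcStatementSetOption(
      stmt, "adbc.statement.next_bind_is_requested_schema", "true", error);
  if (status != ADBC_STATUS_OK) return status;

  // Build an empty (zero-row) array whose only purpose is to carry `requested`.
  struct ArrowArray empty;
  ArrowArrayInitFromSchema(&empty, requested, NULL);
  ArrowArrayStartAppending(&empty);
  ArrowArrayFinishBuildingDefault(&empty, NULL);

  return AdbcStatementBind(stmt, &empty, requested, error);
}
```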
Other database client specifications typically have some provision for letting the caller say "give me the value in column 1 as a &lt;type&gt;" -- in ODBC this is done via the binding mechanism, and both JDBC and ADO.NET let a caller say "getString" (and it's up to the driver to perform the conversion or fail).
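For reference, a minimal ODBC sketch of that binding pattern, where the caller chooses the C type up front and the driver must convert or fail:

```c
#include <sql.h>
#include <sqlext.h>

void BindFirstColumnAsDouble(SQLHSTMT hstmt) {
  SQLDOUBLE value = 0.0;
  SQLLEN indicator = 0;
  // The driver must return column 1 as SQL_C_DOUBLE regardless of the
  // database-side declared type (or report an error if it cannot convert).
  SQLBindCol(hstmt, 1, SQL_C_DOUBLE, &value, 0, &indicator);
  while (SQL_SUCCEEDED(SQLFetch(hstmt))) {
    // `value` now holds the converted double for the current row.
  }
}
```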