PostgreSQL Timestamp with time zone range type not supported #1550

ttomasz · 2024-02-15T21:39:15Z

What happened?

On arrow table fetch I got error saying that type 3910 ( tstzrange ) is not implemented

How can we reproduce the bug?

Create a table with timestamp range and try to read it using cursor.fetch_arrow_table()

Environment/Setup

PostgreSQL driver version 0.9.0
Python 3.9
pyarrow 15.0.0
Windows
pip

lidavidm · 2024-02-15T22:05:35Z

Arrow doesn't have a native range type. Though it should've fallen back to giving you the raw bytes.

What would be a useful way to encode the data?

CurtHagenlocher · 2024-02-16T14:46:52Z

I imagine you could model it as a structure of two values and two flags. More generally, PostgreSQL seems to have a long tail of exotic data types and it's probably helpful to have generic guidance for those.

lidavidm · 2024-02-16T15:26:09Z

Yes, well, the question is if it might be worth trying to define canonical extension types for each of these, or if we're going to just do something ad-hoc in this driver specifically

ttomasz · 2024-02-16T23:12:11Z

What would be a useful way to encode the data?

Probably a struct with lower and upper bound timestamps and boolean flags indicating inclusivity like Curt mentioned. It could also be dumped in text representation that would be castable back to range type if loaded back to postgres.

I don't know if the latter could be made automatically if type is not recognized.

lidavidm · 2024-02-17T00:12:49Z

The extension type would make it easier for us to round trip data, though, if that's expected. (Also, it's the kind of thing that's harder to change down the line.)

We can't figure out the string representation automatically. We get whatever bytes Postgres gives us and have to interpret it.

lidavidm · 2024-02-17T00:15:04Z

This has been discussed before if someone would like to advance the discussion.

apache/arrow#28389
https://lists.apache.org/thread/ybrm5m27xc163nmjzn9zwsvbq6cfc0x0

lidavidm · 2024-02-17T00:16:51Z

That said, I'm sympathetic to Curt's point that there's going to be quite a few of these, and indeed I'm not sure every such type would want/need a dedicated extension type. CC @paleolimbot @zeroshade for opinions too.

I could see an argument for having a non-canonical extension type (that we just define as part of the Postgres driver), but working with extension types may be a little awkward downstream?

paleolimbot · 2024-02-17T02:48:14Z

Though it should've fallen back to giving you the raw bytes.

The only other time I've experienced this is when I created a type in the Postgres driver using the same underlying "database" object. We cache the type table in the database object and we don't currently provide a way to invalidate that. It could also be that something about the logic that queries the type table misses the range type, or there is an error when inserting it into the "type resolver". The place where that gets built is here:

arrow-adbc/c/driver/postgresql/database.cc

Line 133 in 37026a9

AdbcStatusCode PostgresDatabase::RebuildTypeResolver(struct AdbcError* error) {

I imagine you could model it as a structure of two values and two flags.

I agree that a struct would be the most intuitive. I would have to check how this comes through in COPY output, but I imagine you would need four columns.

the question is if it might be worth trying to define canonical extension types for each of these, or if we're going to just do something ad-hoc in this driver specifically

You could probably do it with one extension type (maybe arrow.adbc.postgres), whose serialization held the type OID (which is basically all we have at the point we start constructing a result). Field metadata would do about the same thing except with an extension type you can customize how those arrays are converted to Python objects.

What we could do about this in the short-term is:

Make sure there's an option to return the type OID and/or name as field metadata for all fields (so that consumers can implement workarounds if they need more granularity on Postgres type)
Return range types as structs (and make sure they make it into the type resolver)

lidavidm · 2024-02-17T18:06:42Z

Hmm, I like the idea of a single generic extension type that just flags "hey, we don't know how to interpret this, but here are the raw bytes". I also like that over just field metadata since the field metadata is a bit of an ad-hoc convention vs. extension types which are actually meant for this sort of thing (I think).

Make sure there's an option to return the type OID and/or name as field metadata for all fields (so that consumers can implement workarounds if they need more granularity on Postgres type)

I wouldn't be opposed to just always doing this (this came up elsewhere too)

Return range types as structs (and make sure they make it into the type resolver)

I'm OK with just encoding as struct, then. Possibly also still wrapped under said generic extension type?

ttomasz added the Type: bug Something isn't working label Feb 15, 2024

lidavidm added this to the ADBC Libraries 0.11.0 milestone Feb 15, 2024

lidavidm mentioned this issue Mar 19, 2024

refactor(c/driver/sqlite): port to driver base #1603

Merged

lidavidm modified the milestones: ADBC Libraries 0.11.0, ADBC Libraries 1.0.0 Mar 27, 2024

lidavidm removed this from the ADBC Libraries 1.0.0 milestone Apr 23, 2024

lidavidm added this to the ADBC Libraries 13 milestone May 10, 2024

lidavidm modified the milestones: ADBC Libraries 13, ADBC Libraries 14 Jun 30, 2024

lidavidm modified the milestones: ADBC Libraries 14, ADBC Libraries 15 Aug 20, 2024

lidavidm modified the milestones: ADBC Libraries 15, ADBC Libraries 16 Nov 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PostgreSQL Timestamp with time zone range type not supported #1550

PostgreSQL Timestamp with time zone range type not supported #1550

ttomasz commented Feb 15, 2024

lidavidm commented Feb 15, 2024

CurtHagenlocher commented Feb 16, 2024

lidavidm commented Feb 16, 2024

ttomasz commented Feb 16, 2024

lidavidm commented Feb 17, 2024

lidavidm commented Feb 17, 2024

lidavidm commented Feb 17, 2024

paleolimbot commented Feb 17, 2024

lidavidm commented Feb 17, 2024

PostgreSQL Timestamp with time zone range type not supported #1550

PostgreSQL Timestamp with time zone range type not supported #1550

Comments

ttomasz commented Feb 15, 2024

What happened?

How can we reproduce the bug?

Environment/Setup

lidavidm commented Feb 15, 2024

CurtHagenlocher commented Feb 16, 2024

lidavidm commented Feb 16, 2024

ttomasz commented Feb 16, 2024

lidavidm commented Feb 17, 2024

lidavidm commented Feb 17, 2024

lidavidm commented Feb 17, 2024

paleolimbot commented Feb 17, 2024

lidavidm commented Feb 17, 2024