Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Go] FlightSQL Stateless Prepared Statements handling of Any protobuf messages #34

Open
matthewmturner opened this issue May 6, 2024 · 17 comments
Labels
Type: bug Something isn't working

Comments

@matthewmturner
Copy link

Describe the bug, including details regarding any error messages, version, and platform.

We just upgraded our Go FlightSQL client to point to master so that we can use the recently added stateless prepared statements (apache/arrow#41428 and apache/arrow#40311). However, we have had issues while trying to use this when getting data from our Rust FlightSQL server. Specifically, we have been unable to deserialize the DoPutPreparedStatementResult that the server is returning.

We realized that the Rust implementation is returning Any protobuf messages (which we also verified in the protobuf messages being sent across the wire) but the go FlightSQL client is not deserializing using anypb to decode this message. Given that the prepared statement handle is meant to represent an opaque handle it seemed appropriate for it to use Any so I believe the Go side should be updated to use this - but let me know if I am mistaken.

I am also raising this to the Rust community to confirm that returning Any messages was intentional.

Component(s)

Go

@matthewmturner matthewmturner added the Type: bug Something isn't working label May 6, 2024
@zeroshade
Copy link
Member

@matthewmturner I just want to clarify something, the response from DoPut is a PutResult message that contains a single bytes field named app_metadata. In the case of the DoPut for DoPutPreparedStatementResult we were deserializing the app_metadata directly into a DoPutPreparedStatementResult, which contains the handle as a bytes member. Are you expecting the app_metadata of PutResult to be a an Any protobuf in order to retrieve the DoPutPreparedStatementResult? Or are you expecting the handle member to be an Any?

@lidavidm Since previously DoPutPreparedStatement didn't return anything until we added the DoPutPreparedStatementResult for updating the handle, there wasn't a convention for what should be returned. Looking at the results from the other messages I see the following general convention:

  • Responses from requests that go through the underlying DoAction (BeginSavepointResult, BeginTransactionResult, CancelQueryResult) are assumed to be serialized in an Any
  • responses from DoPut such as DoPutUpdateResult, are not assumed to be serialized as an Any inside of the app_metadata field of PutResult.

That said, it does appear that the Go implementation is outnumbered here as the C++ implementation does assume an Any wrapper around the DoPutPreparedStatementResult, but it just seems odd to me as it breaks the existing convention that we had, leaving it to be kind of arbitrary and not documented anywhere.

@matthewmturner
Copy link
Author

@zeroshade appreciate the quick reply.

As for my expectations, im just expecting there to be consistency across the different language implementations. I thought that the spec would define how this is supposed to look on the wire.

Im personally more aligned with the Any wrapper around the DoPutPreparedStatementResult though which I believe would make it easier to deserialize and add logic on the client side if needed.

Also FWIW on the rust side it looks like everything within do_put is wrapped in Any

@kou kou changed the title FlightSQL Stateless Prepared Statements handling of Any protobuf messages [Go] FlightSQL Stateless Prepared Statements handling of Any protobuf messages May 6, 2024
@lidavidm
Copy link
Member

lidavidm commented May 6, 2024

Can we get Rust in compliance? This is rather concerning. If we need to expand the integration tests then let's do so. Or we should tackle the old issue about using a pure gRPC+Protobuf client without Arrow libraries to do some conformance testing.

The idea of Flight SQL is that we shouldn't be adding extra client logic. So I would be against the Any wrapper. (It's also probably redundant on the request side when used in DoAction but that ship has sailed.)

@lidavidm
Copy link
Member

lidavidm commented May 6, 2024

+this is more reason to disentangle Flight-as-an-Arrow-transport from Flight-as-a-forced-service-definition since being able to define a proper gRPC service for Flight SQL would make these mistakes harder/impossible and would make testing, implementation, and telemetry much easier

@matthewmturner
Copy link
Author

@lidavidm thank you for the feedback. I will raise the issue on the Rust side.

@matthewmturner
Copy link
Author

@lidavidm Also, can you clarify what compliance means in this case? I would have thought compliance refers to following spec but it seems were discussing convention here. And to confirm does this mean the C++ implementation will also be updated to no longer use Any?

@lidavidm
Copy link
Member

lidavidm commented May 7, 2024

Right, the spec is insufficient here because of Flight RPC (if this were a gRPC service definition there would be no issue). So we should fix the spec too and add better conformance testing.

I would like the spec here to follow the existing convention and not use Any, as the result type should be a fixed quantity in this case. (Really, nothing should have used Any in the first place, but it's too late for that.) I'm open to being convinced otherwise, though

@matthewmturner
Copy link
Author

@lidavidm makes sense and definitely agree on the conformance testing. my only concern is that there may be some language specific conventions that could be conflicting. separate from the issues in these repos, and in particular if there will be a spec change, this is probably worth an email thread to gather more opinions. i can work on that tomorrow.

@erratic-pattern
Copy link
Contributor

The rationale behind the Any wrapper is that it is consistent with other command-specific responses in the arrow-rs implementation. Note the as_any method calls on each of the specific command types here:

https://github.com/apache/arrow-rs/blob/3d3ddb2108502854da98654ada85364d5627ef21/arrow-flight/src/sql/server.rs#L708-L742

Is this behavior inconsistent with the other arrow clients?

I think it would be confusing to handle this result type differently from the others.

@erratic-pattern
Copy link
Contributor

@lidavidm makes sense and definitely agree on the conformance testing. my only concern is that there may be some language specific conventions that could be conflicting. separate from the issues in these repos, and in particular if there will be a spec change, this is probably worth an email thread to gather more opinions. i can work on that tomorrow.

Yes, if there are language convention differences in how the app_metadata is encoded across all of the PutResult message types then that would suggest a larger problem that needs to be addressed. That would be surprising considering that no divergent behavior has been reported until now.

@lidavidm
Copy link
Member

I don't think there's been enough independent implementations, really, and I don't think there's too much cross-testing (e.g. ADBC gets tested against the Go SQLite sample and the OSS edition of Dremio, but we don't have a test for InfluxDB, IIRC last I looked Flight SQL was only in the cloud product?)

@matthewmturner
Copy link
Author

@lidavidm @erratic-pattern thank you both for the feedback.

I reviewed some more and agree with the proposal that the Rust implementation should be updated to not use Any. Let me know if you think I am mistaken in my analysis.

@lidavidm
Copy link
Member

Sounds good to me, thank you.

@zeroshade is there any work going on to both clarify the spec and test it more thoroughly?

@zeroshade
Copy link
Member

Not currently on my end. But i can add it to my list of stuff and make sure that we get around to it

@erratic-pattern
Copy link
Contributor

I have a PR for arrow-rs apache/arrow-rs#5817 I have not tested yet how it behaves with other client/server implementations.

@erratic-pattern
Copy link
Contributor

Is there anything left to do here? I think with closed all of the implementations should be consistent now. That being said, having an open issue for cross-language testing would be good. Maybe that should be raised as a separate issue?

@lidavidm
Copy link
Member

A separate issue for cross-language testing would be good

@assignUser assignUser transferred this issue from apache/arrow Aug 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants