go/adbc/driver/flightsql: long delay between createPreparedStatement and getFlightInfoPreparedStatement #2040
Comments
....hmm. 30 minutes is rather concerning. Would it be possible to get a thread dump (e.g. with gdb) if you notice that happening? There is a log level env var that can be set (`ADBC_DRIVER_FLIGHTSQL_LOG_LEVEL=debug`), but I guess the first question is whether this is in any way reproducible, or something that was noticed after the fact with telemetry.

We should be totally fine with concurrency. There are some locks around internal state, but they are meant to be held for extremely short periods.
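For reference, a minimal sketch of enabling that from the Go side (this assumes the variable is read when the driver initializes, so it has to be set before anything is opened; in a containerized deployment you'd export it in the pod spec instead):

```go
package main

import (
	"os"

	"github.com/apache/arrow-adbc/go/adbc/driver/flightsql"
	"github.com/apache/arrow/go/v17/arrow/memory" // Arrow major version may differ
)

func main() {
	// Assumption: the driver picks this up at initialization time, so it
	// must be in the environment before NewDriver/NewDatabase are called.
	os.Setenv("ADBC_DRIVER_FLIGHTSQL_LOG_LEVEL", "debug")

	drv := flightsql.NewDriver(memory.DefaultAllocator)
	_ = drv // open a Database/Connection as usual from here
}
```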
Unfortunately it's not reproducible; we noticed it after the fact when users reported queries hanging. I only saw this happen with 4 queries, all around the same time, 3 on the same pod and 1 on a separate pod. We might try setting `ADBC_DRIVER_FLIGHTSQL_LOG_LEVEL=debug`.
@zeroshade or @joellubi any ideas here what could've happened?
Or is there any way to get Go to dump its stacks without killing the process? (Maybe that's another debug feature we should consider: optionally binding to SIGUSR or something.)
There are a few ways we could do so, though if this is being driven from Python threads it'll be a bit difficult to track down which stack is the problematic one. We could also have a signal start pprof profiling to a known filename, with a second signal stopping it. Then it would be easy to pull the profile up, read it, and check it out.
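To make the idea concrete, here's a rough sketch (hypothetical; not something the driver ships today): SIGUSR1 dumps every goroutine's stack to stderr without killing the process, and SIGUSR2 toggles a CPU profile to a known filename:

```go
package main

import (
	"os"
	"os/signal"
	"runtime/pprof"
	"syscall"
)

// installDebugSignals is a hypothetical debug hook. SIGUSR1 writes all
// goroutine stacks to stderr (the process keeps running); SIGUSR2 starts
// a CPU profile to a known filename, and a second SIGUSR2 stops it.
func installDebugSignals() {
	ch := make(chan os.Signal, 2)
	signal.Notify(ch, syscall.SIGUSR1, syscall.SIGUSR2)
	go func() {
		var prof *os.File // non-nil while a CPU profile is running
		for sig := range ch {
			switch sig {
			case syscall.SIGUSR1:
				// debug=2 prints a full stack trace for every goroutine.
				pprof.Lookup("goroutine").WriteTo(os.Stderr, 2)
			case syscall.SIGUSR2:
				if prof == nil {
					prof, _ = os.Create("/tmp/adbc_cpu.pprof") // hypothetical path
					pprof.StartCPUProfile(prof)
				} else {
					pprof.StopCPUProfile()
					prof.Close()
					prof = nil
				}
			}
		}
	}()
}

func main() {
	installDebugSignals()
	select {} // keep the process alive for demonstration
}
```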
Well, this hasn't happened again, and we've since noticed some memory issues. It's possible that this pod was under memory strain and that caused the slowness. I'm going to close this and report back if we see it again.
Well, sadly we've seen continued issues with this now that we've fixed the memory issues, so I'm reopening. Any thoughts on how we might try to debug this?
We're about to do a release. We could add a bit more logging and hope that helps narrow things down?
That said, those are 2 separate calls in Python, so maybe the Python side could also be logged. Also, if you don't strictly need a prepared statement, that could be disabled as well.
So specifically, you can explicitly call `adbc_prepare`. I think to avoid the prepared statement, you'd have to drop down to the raw API. We could add a config option or something to control this instead...
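For reference, a sketch of what dropping down to the raw Go API looks like, assuming a FlightSQL endpoint (the URI here is hypothetical). Never calling `Prepare` means the query goes through the regular Statement flow rather than `createPreparedStatement`:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/apache/arrow-adbc/go/adbc"
	"github.com/apache/arrow-adbc/go/adbc/driver/flightsql"
	"github.com/apache/arrow/go/v17/arrow/memory" // Arrow major version may differ
)

func main() {
	ctx := context.Background()

	drv := flightsql.NewDriver(memory.DefaultAllocator)
	db, err := drv.NewDatabase(map[string]string{
		adbc.OptionKeyURI: "grpc://localhost:31337", // hypothetical endpoint
	})
	if err != nil {
		log.Fatal(err)
	}

	cnxn, err := db.Open(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer cnxn.Close()

	stmt, err := cnxn.NewStatement()
	if err != nil {
		log.Fatal(err)
	}
	defer stmt.Close()

	if err := stmt.SetSqlQuery("SELECT 1"); err != nil {
		log.Fatal(err)
	}
	// No stmt.Prepare(ctx) here: executing directly should use the
	// regular Statement flow instead of createPreparedStatement.
	rdr, _, err := stmt.ExecuteQuery(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer rdr.Release()

	for rdr.Next() {
		fmt.Println(rdr.Record())
	}
}
```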
Ohhh, interesting, I noticed the `adbc_prepare` method. Unfortunately using regular Statements is not an option for us right now: since neither ADBC nor JDBC use those requests, we have not implemented them on our server. Side note/question: is there any plan to have (A|J)DBC use regular Statements instead of PreparedStatements? We did notice some recent attempted requests from a Go ADBC driver to the regular Statement flow on our server.
The Python ADBC bindings tend to use the prepared-statement workflow rather than regular statements as a performance/safety measure: it ensures compatibility with DBAPI and parameter binding, while also allowing an easier workflow for users and reducing code duplication in the bindings. The native implementations (the actual ADBC spec, the Go implementation, the C# implementation, etc.) expose the lower-level constructs, which let users decide for themselves whether or not they call `Prepare`.
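Continuing the raw-API sketch above, opting in to the prepared-statement flow in Go is an explicit call (same hypothetical `stmt` as before):

```go
// Opting in to the prepared-statement flow on the same statement.
if err := stmt.SetSqlQuery("SELECT 1"); err != nil {
	log.Fatal(err)
}
if err := stmt.Prepare(ctx); err != nil { // createPreparedStatement happens here
	log.Fatal(err)
}
rdr, _, err := stmt.ExecuteQuery(ctx) // now getFlightInfoPreparedStatement
```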
I'll try to add an example.
I'll try to add examples here too. And/or think of a way to let you opt out of this in the DBAPI bindings.
Relates to apache#2040.
See #2097.
What happened?
We have a Python API that uses the ADBC driver to execute queries against our Java Arrow Flight SQL server. The Python server uses FastAPI + Strawberry, and when we receive a request to execute a query, we spin up a background thread to handle the execution against the AFS server. Multiple threads on the same pod could be executing queries against the AFS server at any given moment.
We recently noticed some issues with hanging queries, and when looking at our DataDog traces, we noticed that there is almost a 30-minute gap between the `createPreparedStatement` request and the `getFlightInfoPreparedStatement` request.

My initial guess is that this could be related to having multiple requests going through the ADBC driver at the same time, but I don't have enough context about how the bindings between Go and Python work.

Is there anything that jumps out at you? Is there anything we could do to help debug this? Here are pictures of the traces:
Stack Trace
No response
How can we reproduce the bug?
No response
Environment/Setup
Python 3.12
ADBC FlightSQL driver 1.0.0
ADBC driver manager 1.0.0