
Application stops releasing/reusing connections when the pool size is exhausted and ends up in a deadlock #213

bmagrys opened this issue Jul 5, 2024

Bug Report

Versions

  • Driver:
    org.postgresql:postgresql:jar:42.6.2
    io.r2dbc:r2dbc-pool:jar:1.0.1.RELEASE
  • Database: PostgreSQL 14.9
  • Java: 17
  • OS: macOS + Linux

Current Behaviour

I have a Spring Boot service that uses R2DBC with pooling, and multiple endpoints that call transactional code. At some point, some operations needed to run outside the current transaction as separate operations, so that they would not be tied to the current transaction and would not be rolled back along with it. Under higher load, when Netty's full throughput capacity is in use, the application randomly gets stuck forever on some database operation. Based on the DEBUG logs enabled for R2DBC, acquiring or suspending the connection gets stuck quite randomly. Without any timeout the app stops accepting new requests. With a timeout it is only slightly less bad, because it throws an exception (although in my opinion it shouldn't fail at all, just take longer than usual). The last log line is usually "Suspending current transaction, creating new transaction with name [...]", but not always.

Steps to reproduce

The easiest scenario is as follows: make more simultaneous requests (or just transactional service invocations) than the size of the Reactor/Netty pool. In my case that was 10 by default (based on the number of cores), which is also the default size of the r2dbc-pool connection pool. I made 20 requests/invocations as an example, but from my experience even 11 should be enough. If the transactional service invokes another transactional service with propagation REQUIRES_NEW, it gets stuck while suspending the transaction; a minimal sketch of that call shape follows the log line below.
05-07-2024 10:32:04.717 [DefaultDispatcher-worker-7 @coroutine#30] DEBUG o.s.r.c.R2dbcTransactionManager.handleExistingTransaction:207 - Suspending current transaction, creating new transaction with name [com.bmagrys.r2dbc.locked.Demo2Service.test]
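
For illustration, a minimal sketch of the call shape described above. The class and method names mirror the log line, but the bodies are hypothetical assumptions about the reproduction code, not copied from it:

```kotlin
import org.springframework.stereotype.Service
import org.springframework.transaction.annotation.Propagation
import org.springframework.transaction.annotation.Transactional

@Service
class DemoService(private val demo2Service: Demo2Service) {

    // Outer unit of work: holds a pooled connection for the whole transaction.
    @Transactional
    suspend fun outer() {
        // ... R2DBC work on the current connection ...

        // This call must not participate in (or roll back with) the outer
        // transaction, so it runs with REQUIRES_NEW: Spring suspends the outer
        // transaction and tries to acquire a SECOND pooled connection while
        // the first one is still held.
        demo2Service.test()
    }
}

@Service
class Demo2Service {

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    suspend fun test() {
        // independent unit of work
    }
}
```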

Reproduction repository: https://github.com/bmagrys/r2dbc-pool-issue-deadlock
Executing DemoApplicationTests there takes forever.
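
A hypothetical load driver for the 20-invocation case (again an assumption on my part, not taken from the repository):

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope

// Fire more concurrent transactional calls than the pool has connections.
// With a pool of 10 and 20 concurrent outer transactions, all 10 connections
// can be held by outer transactions that are each blocked waiting to acquire
// an extra connection for REQUIRES_NEW: a classic pool self-deadlock.
suspend fun hammer(service: DemoService) = coroutineScope {
    (1..20).map { async { service.outer() } }.awaitAll()
}
```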

Expected behaviour

Even when the pool is under high load and fully used, the app shouldn't hang forever. Even with timeouts configured, the transaction shouldn't fail; it should just be slower. Exhausting a pool that is not that small, and that equals the shared Reactor pool size, shouldn't make the app unresponsive or break the transaction.

I tried the exact same scenario on a non-reactive stack using HikariCP, and it is not a problem there; the same code executes just fine.
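
As a hedged workaround sketch (my assumption, not a confirmed fix): sizing the pool above the worst-case nesting depth and bounding acquisition time at least turns the silent hang into a visible, fast failure. The values below are illustrative only:

```kotlin
import io.r2dbc.pool.ConnectionPool
import io.r2dbc.pool.ConnectionPoolConfiguration
import io.r2dbc.spi.ConnectionFactories
import java.time.Duration

fun buildPool(): ConnectionPool {
    val factory = ConnectionFactories.get("r2dbc:postgresql://localhost:5432/demo")
    val config = ConnectionPoolConfiguration.builder(factory)
        // Size for: concurrent outer transactions x (1 + REQUIRES_NEW nesting
        // depth), so a suspended outer transaction can't starve the inner acquire.
        .maxSize(40)
        // Fail fast on exhaustion instead of waiting forever.
        .maxAcquireTime(Duration.ofSeconds(5))
        .build()
    return ConnectionPool(config)
}
```

This doesn't make the timeout behaviour correct as argued above; it only makes the deadlock observable instead of hanging the app.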
