I'm looking for feedback from @esbenp before I dig into a PR for this.
Goal:
I'd like to adapt `pdf-bot` to be a scalable PDF rendering microservice that can have resources added or removed on demand to handle workload fluctuations.
Problem:
Because of pdf-bot's PostgreSQL database-wide queue locking, only one machine can render PDFs for the given API endpoint at a time.
Because PG is a shared database, it's possible to scale the workload horizontally across many machines in parallel. To accomplish this, we would need to change the queue locking mechanism to work on a per-job basis and adapt the generation commands (`shift:all` comes to mind) to support this.
There are a few concerns here:
- This would require a database migration of some sort.
- Process crashes, unhandled errors, etc. could result in jobs never being processed if this is implemented poorly.
- ?
Proposed implementation:
- Add a `processing_started_at` timestamp column to the jobs table
- Adapt `getAllUnfinished` to select jobs that aren't completed and whose `processing_started_at` is either null or older than a configurable timeout (30 sec default maybe)
- Make `isBusy` calls always return false (maybe?)
- Adapt the cli scripts (`shift`, `shift:all`, etc.) to handle the possibility of getting an empty array of jobs instead of relying on an `isBusy` call (maybe?)
- Remove `setIsBusy` calls (maybe?)
- Make the same changes to the LowDb adapter as well (maybe?)
- Remove the `worker` table
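As a rough sketch of what per-job locking could look like on the PostgreSQL side: the helper below builds a single statement that picks one eligible job and stamps `processing_started_at` on it in the same step, so two workers should not end up claiming the same job. The table and column names follow the proposal above. `buildClaimQuery` is a hypothetical name, and using `FOR UPDATE SKIP LOCKED` (PostgreSQL 9.5+) is my assumption about how this might be done, not something pdf-bot does today.

```javascript
// Hypothetical helper, not part of pdf-bot: builds a query that atomically
// claims one eligible job. Assumes a `jobs` table with `id`,
// `processing_started_at` and `completed_at` columns, as in the proposal.
function buildClaimQuery(timeoutSeconds = 30) {
  const text = `
    UPDATE jobs
       SET processing_started_at = now()
     WHERE id = (
       SELECT id
         FROM jobs
        WHERE completed_at IS NULL
          AND (processing_started_at IS NULL
               OR processing_started_at < now() - make_interval(secs => $1))
        ORDER BY id
        LIMIT 1
        FOR UPDATE SKIP LOCKED
     )
    RETURNING *`;
  // A { text, values } object is the parameterized query shape node-postgres accepts.
  return { text, values: [timeoutSeconds] };
}

const claimQuery = buildClaimQuery();
console.log(claimQuery.values);
```

Assuming node-postgres, this object could be passed to a `query` call as-is. An empty `rows` result would then mean there is nothing left to process, which is the case the cli scripts would need to handle instead of calling `isBusy`.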
| id | processing_started_at | completed_at |
| --- | --- | --- |
| 1 | 2018-01-08 17:31:17.825153 | 2018-01-08 17:31:48.925153 |
| 2 | 2018-01-08 17:31:17.825153 | null |
| 3 | 2018-01-08 17:31:47.925153 | null |
| 4 | 2018-01-08 17:31:48.925153 | null |
| 5 | null | null |
| 6 | null | null |
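Given this sample data and a 30-second timeout, evaluated shortly after job 4 was started, jobs 2, 5, and 6 would be eligible for the next generation worker to start processing: jobs 5 and 6 were never started, and job 2 was started more than 30 seconds ago without completing. Jobs 3 and 4 are assumed to be currently processing, and job 1 is already completed.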
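To make the eligibility rule concrete, here is a minimal sketch that applies it to the sample rows in plain JavaScript. `isEligible` is a hypothetical name, the "current time" of 17:31:50 is an assumption chosen to fall just after job 4 was started, and the timestamps are written as UTC ISO strings truncated to milliseconds so that `Date.parse` handles them predictably.

```javascript
const TIMEOUT_MS = 30 * 1000; // the proposed 30 second default

// Hypothetical helper: a job is eligible if it is not completed and was
// either never started or started longer ago than the timeout.
function isEligible(job, now, timeoutMs = TIMEOUT_MS) {
  if (job.completed_at !== null) return false;
  if (job.processing_started_at === null) return true;
  return now - Date.parse(job.processing_started_at) > timeoutMs;
}

// Sample data from the table, microseconds truncated to milliseconds.
const jobs = [
  { id: 1, processing_started_at: '2018-01-08T17:31:17.825Z', completed_at: '2018-01-08T17:31:48.925Z' },
  { id: 2, processing_started_at: '2018-01-08T17:31:17.825Z', completed_at: null },
  { id: 3, processing_started_at: '2018-01-08T17:31:47.925Z', completed_at: null },
  { id: 4, processing_started_at: '2018-01-08T17:31:48.925Z', completed_at: null },
  { id: 5, processing_started_at: null, completed_at: null },
  { id: 6, processing_started_at: null, completed_at: null },
];

// Assumed evaluation time, not part of the sample data.
const now = Date.parse('2018-01-08T17:31:50.000Z');

const eligibleIds = jobs.filter((job) => isEligible(job, now)).map((job) => job.id);
console.log(eligibleIds); // [ 2, 5, 6 ]
```

The same rule also covers the crash concern: a job whose worker died never gets a `completed_at`, so it becomes eligible again once the timeout has passed.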
If this all sounds like too big of an overhaul, I'd be open to other suggestions. I'd also be willing to add this support in a new Redis database adapter instead.