feat(router): dynamic sizing of worker job buffers (#6463)
🔒 Scanned for secrets using gitleaks 8.28.0
# Description
This update introduces **dynamic buffer sizing** for job queues within
router worker channels. Previously, the buffer capacity was fixed using
the `noOfJobsPerChannel` configuration. With this change, the effective
buffer capacity can now grow or shrink (up to `maxNoOfJobsPerChannel`,
default: 10,000) based on the following calculation strategies:
### **1. Standard** (default)
Buffer capacity is set to:
`max(noOfJobsPerChannel, noOfJobsToBatchInAWorker)`
> `noOfJobsToBatchInAWorker` remains a reloadable value.
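
A minimal sketch of the Standard strategy, assuming the result is also capped at `maxNoOfJobsPerChannel` as the description above implies. The function name `standardBufferCapacity` is hypothetical and is not taken from the router code.

```go
package main

// standardBufferCapacity is a hypothetical helper illustrating the Standard
// strategy: take the larger of the two configured values, then cap the result
// at maxNoOfJobsPerChannel (the cap is an assumption based on the description).
func standardBufferCapacity(noOfJobsPerChannel, noOfJobsToBatchInAWorker, maxNoOfJobsPerChannel int) int {
	capacity := noOfJobsPerChannel
	if noOfJobsToBatchInAWorker > capacity {
		capacity = noOfJobsToBatchInAWorker
	}
	if capacity > maxNoOfJobsPerChannel {
		capacity = maxNoOfJobsPerChannel
	}
	return capacity
}
```

For illustration, `standardBufferCapacity(1000, 20, 10000)` gives 1000, while `standardBufferCapacity(10, 50, 10000)` gives 50 because the batch size is the larger value.
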
### **2. Experimental** (enabled through `enableExperimentalBufferSizeCalculator`)
The buffer capacity is determined using the following metrics:
1. **Work loop throughput** — average number of jobs processed per
second by the worker's loop (`workLoopThroughput`).
2. **Query batch size per worker** — calculated as `jobQueryBatchSize /
noOfWorkers` — note that `jobQueryBatchSize` can change dynamically if
pickup throttling is enabled.
3. **Configured batch size** — `noOfJobsToBatchInAWorker`.
The **maximum** of the three values above is selected and then
**multiplied by a scaling factor of 2.0** to compute the dynamic buffer
size.
**Special Case:**
If `workLoopThroughput < 1`, the buffer size is forced to `1` to
intentionally start small and apply backpressure in scenarios of low
processing throughput.
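
The Experimental calculation above could be sketched roughly as follows. The function name `experimentalBufferCapacity`, the rounding up to a whole number, the guard for a non-positive worker count, and the final cap at `maxNoOfJobsPerChannel` are assumptions made for illustration; only the three inputs, the 2.0 scaling factor, and the low-throughput special case come from the description. As a worked example, a throughput of 150 jobs/s with `jobQueryBatchSize = 10000`, 64 workers, and a batch size of 20 gives max(150, 156.25, 20) × 2.0 = 312.5, which this sketch rounds up to 313.

```go
package main

import "math"

// scalingFactor is the multiplier named in the description.
const scalingFactor = 2.0

// experimentalBufferCapacity is a hypothetical helper, not the router's code.
func experimentalBufferCapacity(
	workLoopThroughput float64, // average jobs processed per second by the work loop
	jobQueryBatchSize, noOfWorkers int,
	noOfJobsToBatchInAWorker int,
	maxNoOfJobsPerChannel int,
) int {
	// Special case: start small and apply backpressure when throughput is low.
	if workLoopThroughput < 1 {
		return 1
	}
	// Assumption: treat a non-positive worker count as a single worker
	// to avoid dividing by zero.
	if noOfWorkers < 1 {
		noOfWorkers = 1
	}
	queryBatchPerWorker := float64(jobQueryBatchSize) / float64(noOfWorkers)

	// Take the maximum of the three metrics, then scale it.
	largest := math.Max(workLoopThroughput,
		math.Max(queryBatchPerWorker, float64(noOfJobsToBatchInAWorker)))
	capacity := int(math.Ceil(largest * scalingFactor)) // rounding up is an assumption

	// Assumption: the dynamic size never exceeds maxNoOfJobsPerChannel.
	if capacity > maxNoOfJobsPerChannel {
		return maxNoOfJobsPerChannel
	}
	return capacity
}
```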
---
<img width="1742" height="753" alt="router 101-Router throughput drawio (2)" src="https://github.com/user-attachments/assets/499cb02a-96e0-416c-a09f-d84dfea087fc" />
## Additional Changes
- During router shutdown, workers no longer wait to drain their job
buffers. They exit early instead. Any in-progress jobs are marked as
failed by the router on the next startup.
- New worker statistics are now recorded:
    1. buffer capacity: `router_worker_buffer_capacity`
    2. buffer size: `router_worker_buffer_size`
    3. average work-loop throughput: `router_worker_work_loop_throughput`
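
The early-exit shutdown behavior described in the first bullet can be illustrated with a small sketch. It is not the router's worker implementation: `job`, `runWorker`, and `simulateShutdown` are hypothetical names, and the sketch only shows a worker that stops as soon as its context is cancelled, leaving any buffered jobs unprocessed.

```go
package main

import "context"

// job is a hypothetical stand-in for a router job.
type job struct{ id int }

// runWorker processes jobs from buffer until ctx is cancelled or the buffer
// is closed. On cancellation it returns immediately without draining the
// remaining buffered jobs. It returns the number of jobs it processed.
func runWorker(ctx context.Context, buffer <-chan job, process func(job)) int {
	processed := 0
	for {
		// Check for cancellation first so a cancelled worker never
		// picks up another buffered job.
		select {
		case <-ctx.Done():
			return processed
		default:
		}
		select {
		case <-ctx.Done():
			return processed
		case j, ok := <-buffer:
			if !ok {
				return processed
			}
			process(j)
			processed++
		}
	}
}

// simulateShutdown fills a buffer with `buffered` jobs, cancels the context
// after `cancelAfter` jobs have been processed, and reports how many jobs
// were processed and how many were left in the buffer.
func simulateShutdown(buffered, cancelAfter int) (processed, remaining int) {
	buffer := make(chan job, buffered)
	for i := 0; i < buffered; i++ {
		buffer <- job{id: i}
	}
	close(buffer) // lets the worker return if it drains everything

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelAfter <= 0 {
		cancel() // shutdown requested before any job is picked up
	}

	seen := 0
	processed = runWorker(ctx, buffer, func(job) {
		seen++
		if seen >= cancelAfter {
			cancel()
		}
	})
	return processed, len(buffer)
}
```

In this sketch, `simulateShutdown(3, 1)` processes one job and leaves two in the buffer. In the router, per the description, such leftover in-progress jobs are marked as failed on the next startup; that step is not modeled here.
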
## Linear Ticket
resolves PIPE-2500
## Security
- [x] The code changed/added as part of this pull request won't create
any security issues with how the software is being used.
 skipRtAbortAlertForTransformation config.ValueLoader[bool] // represents if event delivery(via transformerProxy) should be alerted via router-aborted-count alert def
-skipRtAbortAlertForDelivery config.ValueLoader[bool] // represents if transformation(router or batch) should be alerted via router-aborted-count alert def
+skipRtAbortAlertForDelivery config.ValueLoader[bool] // represents if transformation(router or batch) should be alerted via router-aborted-count alert def