feat(jobsdb): improve migration compaction by atzoum · Pull Request #6844 · rudderlabs/rudder-server

@atzoum changed the title from "chore(jobsdb): keep only last dataset for writers instead of last two" to "chore(jobsdb): keep only last dataset for writers; guard migration against oversized destination datasets" (Apr 3, 2026)

@atzoum changed the title from "chore(jobsdb): keep only last dataset for writers; guard migration against oversized destination datasets" to "chore(jobsdb): improve migration compaction" (Apr 3, 2026)

@atzoum changed the title from "chore(jobsdb): improve migration compaction" to "feat(jobsdb): improve migration compaction" (Apr 3, 2026)


@atzoum marked this pull request as draft (April 7, 2026 12:56)
Writers only write to the last dataset, so shrinking the datasetList
to only the last one is sufficient. Similarly, migration now exempts
only the last (currently-written) dataset instead of the last two.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

🔒 Scanned for secrets using gitleaks 8.30.1
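The writer-side change described above can be sketched in a few lines of Go. This is a minimal illustration, not the actual jobsdb implementation: `dataSetT` here is a simplified stand-in, and `refreshWriterDatasetList` is a hypothetical helper name.

```go
package main

import "fmt"

// dataSetT is a simplified stand-in for jobsdb's dataset handle
// (the field is illustrative, not the real struct layout).
type dataSetT struct {
	Index string
}

// refreshWriterDatasetList models the change: writers only ever append
// to the last dataset, so the writer's cached list can be shrunk to
// just that one entry instead of the last two.
func refreshWriterDatasetList(all []dataSetT) []dataSetT {
	if len(all) == 0 {
		return nil
	}
	return all[len(all)-1:]
}

func main() {
	all := []dataSetT{{"1"}, {"2"}, {"3"}}
	writers := refreshWriterDatasetList(all)
	fmt.Println(len(writers), writers[0].Index) // 1 3
}
```

Because readers never consult the writer's list, dropping the second-to-last dataset from it has no effect on correctness; it only removes an unnecessary write target.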
…er than maxDSSize

With jobMinRowsLeftMigrateThreshold > 0.5, combining two needsPair datasets
can exceed maxDSSize. Two guards are added to getMigrationList:
- When accumulating additional datasets into an existing batch, break if adding
  the next one would push pendingJobsCount over maxDSSize.
- When combining a waiting dataset with its pair, discard both if their combined
  pending count exceeds maxDSSize, and continue probing for a smaller pair.

firstEligible is now derived from migrateFrom[0] at the end of the loop
instead of being set eagerly, ensuring it always reflects the actual first
dataset that will be migrated rather than a dataset from a discarded pair.
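A small sketch of the derivation, assuming the same toy types as above (`ds` and `deriveFirstEligible` are illustrative names, not the actual code):

```go
package main

import "fmt"

type ds struct{ name string }

// deriveFirstEligible mirrors the change: rather than recording
// firstEligible eagerly during the scan (where it could point at a
// dataset from a later-discarded pair), it is read off the final
// migrateFrom slice after the loop completes.
func deriveFirstEligible(migrateFrom []ds) (ds, bool) {
	if len(migrateFrom) == 0 {
		return ds{}, false
	}
	return migrateFrom[0], true
}

func main() {
	first, ok := deriveFirstEligible([]ds{{"7"}, {"8"}})
	fmt.Println(ok, first.name) // true 7
}
```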

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
