fix(warehouse): prevent race condition during upload job creation (#5076) by deepshekhardas · Pull Request #6727 · rudderlabs/rudder-server

deepshekhardas added 6 commits

February 28, 2026 21:22
…use, jobsdb, and config

- fix(archiver): replace panic calls with graceful error handling (rudderlabs#6630)
- fix(warehouse): add dedicated cleanup context timeout for object storage (rudderlabs#6349)
- fix(warehouse): handle empty staging files gracefully instead of hard error (rudderlabs#5076)
- feat(clickhouse): change default partition strategy from daily to monthly (rudderlabs#5079)
- fix(jobsdb): parameterize customVal in SQL queries to prevent injection (rudderlabs#6284)
- fix(migration): add pgcrypto extension for PostgreSQL 11 compatibility (rudderlabs#6336)
- feat(config): add YAML file support for declarative workspace config (rudderlabs#2045)
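As an illustration of the jobsdb item above, parameterizing a value means keeping it out of the SQL text and passing it as a bind argument. The sketch below is a minimal example under stated assumptions: `buildCustomValQuery`, the table name `jobs_1`, and the selected column are hypothetical and are not taken from the jobsdb code.

```go
package main

import "fmt"

// buildCustomValQuery is a hypothetical helper, not the jobsdb API.
// It returns a query with a $1 placeholder plus the bind arguments,
// so customVal never becomes part of the SQL text.
func buildCustomValQuery(table, customVal string) (string, []any) {
	// Assumption: the table name is generated internally and is trusted.
	query := fmt.Sprintf("SELECT job_id FROM %s WHERE custom_val = $1", table)
	return query, []any{customVal}
}

func main() {
	// A hostile value stays in the argument list and is never interpolated.
	query, args := buildCustomValQuery("jobs_1", "WEBHOOK' OR '1'='1")
	fmt.Println(query)
	fmt.Println(len(args))
}
```

The returned pair can be passed to `(*sql.DB).QueryContext(ctx, query, args...)`, which sends the value separately from the statement.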

Signed-off-by: contributor <contributor@rudderstack.com>
…rk.go

Replace 7 panic(err) calls with structured error logging and graceful
error handling to prevent container crashes on transient failures.
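A minimal sketch of the pattern described above, assuming a helper that used to call `panic(err)` on a JSON unmarshalling failure. `parseParameters` and the sample payloads are hypothetical, and the standard library `log/slog` stands in for whatever structured logger the service uses.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log/slog"
)

// parseParameters is a hypothetical stand-in for a helper that previously
// panicked on bad input. It now wraps the error and returns it to the caller.
func parseParameters(raw []byte) (map[string]any, error) {
	var params map[string]any
	if err := json.Unmarshal(raw, &params); err != nil {
		return nil, fmt.Errorf("unmarshalling job parameters: %w", err)
	}
	return params, nil
}

func main() {
	payloads := [][]byte{[]byte(`{"source_id":"s1"}`), []byte(`{broken`)}
	for _, raw := range payloads {
		params, err := parseParameters(raw)
		if err != nil {
			// Log with structured fields and move on instead of crashing.
			slog.Error("skipping job with invalid parameters", "error", err)
			continue
		}
		slog.Info("parsed job parameters", "count", len(params))
	}
}
```

The caller decides how to react (skip, retry, or mark the job as failed), so one malformed payload no longer takes down the process.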

- handle.go: 5 panics replaced (active partitions, job status updates,
  parameter unmarshalling, status commits, event order state changes)
- worker.go: 1 panic replaced (event order state change in worker)
- network.go: 1 panic replaced (JSON marshal failure returns error response)

This improves router reliability by ensuring transient errors don't
crash the entire service.

- batchrouter/handle_lifecycle.go: 8 panics replaced in crashRecover
  and isolation strategy setup with structured error logging
- jobsdb/migration.go: 1 panic replaced in migrateDSLoop

These changes prevent container crashes during crash recovery and
database migration operations.
…tability

- batchrouter/handle.go: 8 panics replaced (active partitions, DB reads,
  file operations, journal marking, job status updates)
- warehouse/router/identities.go: 13 panics replaced (identity table
  queries, table creation, index management, schema marshalling,
  warehouse manager creation)

All panic calls replaced with structured error logging and graceful
error handling to prevent container crashes on transient failures.
…tability

- processor/: 8 panics replaced in worker loops and status updates
  with structured logging and graceful flow control (continue/return).
- router/batchrouter/asyncdestinationmanager/bing-ads/: 9 panics
  replaced in audience and offline-conversions utility helpers.
- warehouse/router/state.go: 1 panic replaced with empty return for
  invalid state transitions.
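The two ideas in the list above, an empty return for an invalid state transition and `continue` in a worker loop, can be sketched together. This is an illustration only: the `transitions` table, `nextState`, and `advanceAll` are hypothetical names and do not reflect the actual state machine in warehouse/router/state.go.

```go
package main

import "fmt"

// transitions is an illustrative table, not the real warehouse state machine.
var transitions = map[string]string{
	"pending": "running",
	"running": "done",
}

// nextState returns the empty string for an unknown state instead of panicking.
func nextState(current string) string {
	return transitions[current]
}

// advanceAll moves each known state forward and skips unknown ones,
// counting how many were skipped.
func advanceAll(states []string) (advanced []string, skipped int) {
	for _, s := range states {
		next := nextState(s)
		if next == "" {
			skipped++
			continue // graceful flow control: skip this item, keep the loop alive
		}
		advanced = append(advanced, next)
	}
	return advanced, skipped
}

func main() {
	advanced, skipped := advanceAll([]string{"pending", "bogus", "running"})
	fmt.Println(advanced, skipped)
}
```

One trade-off of the empty return is that callers must check for it explicitly; returning an `(string, error)` pair would make the failure harder to ignore.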

This batch of fixes targets the processing pipeline and specialized
async managers to prevent wide-scale container crashes on transient errors.

Remove skipMaintenanceError conditional panic in refreshDSListLoop,
matching the same pattern already fixed in migration.go. Now all DS
refresh errors are logged without crashing the container.

deepshekhardas pushed a commit to deepshekhardas/rudder-server that referenced this pull request

Mar 15, 2026