fix(start-proxy): explicitly fail when proxy retries are exhausted by iaadillatif · Pull Request #3531 · github/codeql-action

This PR hardens proxy startup reliability by converting a silent false-success scenario into an explicit failure.

Fixes #3530.

Previously, the startup loop could exhaust retry attempts without throwing a terminal error. In such cases, outputs could still be set and execution would continue even though the proxy process was not running.

Before:
Proxy started successfully
[process exited immediately]

After:
Failed to start proxy after 5 attempts (last exit code: 1)

This change ensures that proxy initialization fails fast when the proxy process does not stay alive through the stabilization delay.


Root Cause

The previous control flow:

  • Retried startup attempts
  • Only threw on immediate spawn error
  • Did not explicitly fail after retry exhaustion
  • Could report success even when no process remained alive

This allowed workflows to proceed with a non-functional proxy.
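
The failure mode described above can be sketched roughly as follows. This is a hypothetical reconstruction for illustration only, not the code that was removed: the `ProxyProcess` shape, the `startProxyFlawed` name, and its parameters are all assumptions.

```typescript
// Hypothetical handle type, for illustration only.
interface ProxyProcess {
  exitCode: number | null; // null while the process is still running
}

// Sketch of a retry loop that never fails explicitly (assumed shape).
function startProxyFlawed(
  spawnProxy: () => ProxyProcess,
  maxAttempts: number,
): string {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // An immediate spawn error is the only thing that throws here.
    const proc = spawnProxy();
    if (proc.exitCode === null) {
      break; // process still alive: stop retrying
    }
    // Process already exited: loop again, keeping no record of the failure.
  }
  // Reached both after a successful start and after every attempt has exited.
  return "Proxy started successfully";
}
```

Because the code after the loop cannot tell a stable process from exhausted retries, the success path runs in both cases.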


Changes

1. Extracted startup logic

Created src/start-proxy/launcher.ts:

  • Encapsulates startup behavior.
  • Allows dependency injection for testing.
  • Improves separation of concerns.

2. Explicit retry exhaustion handling

  • Tracks lastExitCode.
  • Verifies process remains alive after stabilization delay.
  • Throws Error("Failed to start proxy after 5 attempts (last exit code: X)") when no stable process remains.
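
The three points above could fit together roughly as in the sketch below. It is a minimal illustration under stated assumptions, not the contents of launcher.ts: `ProxyProcess`, `LauncherDeps`, `startProxyWithRetries`, and the 1000 ms delay are hypothetical. Only the five-attempt limit and the error message format come from this description.

```typescript
// Hypothetical shapes: illustrative names, not the PR's actual API.
interface ProxyProcess {
  exitCode: number | null; // null while the process is still running
}

interface LauncherDeps {
  spawnProxy: () => ProxyProcess; // may throw on an immediate spawn error
  sleep: (ms: number) => Promise<void>; // injected so tests need no real timers
}

async function startProxyWithRetries(
  deps: LauncherDeps,
  maxAttempts = 5,
  stabilizationDelayMs = 1000, // assumed value
): Promise<ProxyProcess> {
  let lastExitCode: number | null = null;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Spawn errors are not caught here, so they propagate to the caller.
    const proc = deps.spawnProxy();

    // Give the process time to either settle or exit.
    await deps.sleep(stabilizationDelayMs);

    if (proc.exitCode === null) {
      return proc; // still alive after the delay: treat as stable
    }
    lastExitCode = proc.exitCode; // keep the code for diagnostics
  }

  // Retries exhausted with no stable process: fail explicitly.
  throw new Error(
    `Failed to start proxy after ${maxAttempts} attempts (last exit code: ${lastExitCode})`,
  );
}
```

A test can pass a fake `spawnProxy` that returns `{ exitCode: 1 }` and a `sleep` that resolves immediately, then assert on the thrown message without starting a real process.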

3. Tests

Added launcher.test.ts covering:

  • Retry exhaustion with exit code diagnostics.
  • Spawn error propagation.
  • Successful stabilization and output configuration.

Backward Compatibility

  • No change to output names
  • No change to telemetry format
  • No change to wrapper success/failure reporting
  • No new inputs introduced

The only behavioral change is that a silent false-positive startup now becomes an explicit failure.


Testing

Executed:

  • npm run lint
  • npm run build
  • npm run test

Added tests in src/start-proxy/launcher.test.ts, covering:

  • Exhausted retries
  • Spawn error propagation
  • Stable process success path

All new launcher tests pass on Node 24+.

Why This Matters

This improves workflow reliability by:

  • Ensuring failures surface at the correct lifecycle stage
  • Reducing downstream noise
  • Providing clearer diagnostics

Fail-fast behavior makes startup failures easier to debug and the system more robust.