feat: implement diagnostic channels for observability by logaretm · Pull Request #3195 · redis/node-redis
AI review requested due to automatic review settings
March 11, 2026 20:08Qard approved these changes Mar 11, 2026
Qard approved these changes Mar 11, 2026
Qard approved these changes Mar 11, 2026
This was referenced
Mar 13, 2026Use number | undefined everywhere instead of mixing with number?. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Eliminates repetitive ternary pattern. TracingChannelContextMap maps channel names to their context types so getTracingChannel infers the correct type without explicit generics. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tead metrics.ts no longer imports dc directly. All channel acquisition goes through the exported helpers which handle caching and undefined checks. dc remains private to tracing.ts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace traceCommand, traceBatch, traceConnect, traceConnectionWait with a single trace(CHANNELS.TRACE_*, fn, contextFactory) function. Channel name resolves the context type via TracingChannelContextMap. Also adds cache for getTracingChannel to avoid re-acquiring on every call, matching getChannel's caching pattern. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace separate #subscriptions and #tracingChannels arrays with a single #unsubscribers list of cleanup closures. Simpler destroy(), no need to store channel/handler pairs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Inline the single remaining noop as an object literal in the default #instance. No noop classes or files remain. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove duplicate COMMAND_REPLY publish from sendCommand — typed commands already publish via _executeCommand. Also merge the two separate COMMAND_REPLY subscribers (pubsub out + streaming) into a single #subscribeCommandReply method. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the acquire timeout fires, rejectWait() is called so the TracingChannel error channel fires — APMs see the timeout as an error instead of an orphaned span that never closes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When OTel is not initialized or disabled, skip instrument registration and subscriber creation entirely. No noop instruments needed — if there are no subscribers, channels are never fired. Removes 139-line noop-meter.ts and all noop fallback branches from instrument creation helpers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- COMMAND_REPLY in _executeCommand now passes sanitizeArgs(parser.redisArgs) instead of raw args, preventing sensitive values from leaking to subscribers - Pool connection wait trace promise gets .catch(noop) to prevent unhandled rejection if a tracing subscriber throws Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When pool.destroy() is called with tasks still queued, call rejectWait() on each pending task so the TracingChannel error channel fires and APM subscribers close their spans cleanly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Tracing test: check isOpen before destroying already-closed client - OTel E2E tests: rewrite 5 tests that called removed methods (resiliencyMetrics.recordClientErrors, streamMetrics.recordStreamLag) to publish via channels instead - Connection closed metric: split subscriber for basic (connection count) and advanced (close reason) so tests enabling only one group see correct metrics Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The original code always recorded wait time, even for 0ms waits when a client was available. The TracingChannel refactor only traced the "no client, must wait" path. Add trace for immediate availability so the metric is always emitted. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…or event Remove redundant publish(CHANNELS.ERROR) from sendCommand — the TracingChannel already emits error events for command failures. Wire OTel resiliency subscriber to commandTC.error for command-level error counts; CHANNELS.ERROR now only carries cluster/internal errors. Extract subscribeTC() helper to reduce TracingChannel subscription boilerplate and #recordError() to share error metric recording logic. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous approach unconditionally dropped all MOVED/ASK errors in #recordError, including cluster-origin ones that indicate slot migration. Move the filter to each call site so client-origin redirections are skipped while cluster-origin ones are still recorded. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tion internal and origin are separate fields — an error can be non-internal but still originate from the cluster. Use ctx.origin === 'client' as Pavel suggested to correctly identify client-origin redirections. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Short-circuit with noop stubs when enabled is false, so metrics are not recorded even when meterProvider and enabledMetricGroups are provided. Update the disabled-metrics test to supply those options, verifying the guard works rather than passing vacuously. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Redis replies QUEUED for each command inside MULTI before EXEC runs, so per-command traces resolve as successful even when EXEC fails. Only the batch-level trace reflects the real outcome. Remove per-command tracing from _executeMulti and keep only the TRACE_BATCH span. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters