feat: add Prometheus collector for DERP server expvar metrics by sreya · Pull Request #22583 · coder/coder

added 10 commits

March 3, 2026 22:09
Create a prometheus.Collector that bridges the tailscale derp.Server's
expvar-based stats to Prometheus metrics with namespace coder, subsystem
wsproxy_derp. Handles counters, gauges, labeled metrics (nested
metrics.Set for drop reasons, packet types, etc.), and the average
queue duration (converted from ms to seconds).
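For illustration only, the flattening described above can be sketched with the standard library alone. This is not the collector from this PR: the `sample` type, the `flatten` and `numeric` helpers, the demo keys, and the rule that a `_ms` suffix marks a millisecond value are all assumptions made for the example. A real collector would emit `prometheus.Metric` values rather than this struct.

```go
package main

import (
	"expvar"
	"fmt"
	"strings"
)

// sample is a hypothetical flattened metric: a name, an optional label
// value (taken from a nested map key), and a numeric value.
type sample struct {
	Name  string
	Label string
	Value float64
}

// numeric extracts a float64 from the expvar types handled by this sketch.
func numeric(v expvar.Var) (float64, bool) {
	switch t := v.(type) {
	case *expvar.Int:
		return float64(t.Value()), true
	case *expvar.Float:
		return t.Value(), true
	}
	return 0, false
}

// flatten walks an expvar.Map one level deep. Scalars become unlabeled
// samples, nested maps become one labeled sample per inner key, and keys
// ending in "_ms" (an assumed convention) are converted to seconds.
func flatten(prefix string, m *expvar.Map) []sample {
	var out []sample
	m.Do(func(kv expvar.KeyValue) {
		if inner, ok := kv.Value.(*expvar.Map); ok {
			inner.Do(func(e expvar.KeyValue) {
				if f, ok := numeric(e.Value); ok {
					out = append(out, sample{Name: prefix + kv.Key, Label: e.Key, Value: f})
				}
			})
			return
		}
		f, ok := numeric(kv.Value)
		if !ok {
			return
		}
		name := kv.Key
		if strings.HasSuffix(name, "_ms") {
			name = strings.TrimSuffix(name, "_ms") + "_seconds"
			f /= 1000
		}
		out = append(out, sample{Name: prefix + name, Value: f})
	})
	return out
}

// demoStats builds an unpublished map with made-up keys and values.
func demoStats() *expvar.Map {
	m := new(expvar.Map).Init()
	m.Add("accepts", 3)
	m.AddFloat("average_queue_duration_ms", 250)
	drops := new(expvar.Map).Init()
	drops.Add("queue_head", 2)
	m.Set("packets_dropped_reason", drops)
	return m
}

func main() {
	for _, s := range flatten("coder_wsproxy_derp_", demoStats()) {
		fmt.Printf("%s label=%q value=%g\n", s.Name, s.Label, s.Value)
	}
}
```

The sketch builds its maps with `new(expvar.Map).Init()` so nothing is published to the global expvar registry.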

Register the collector in the wsproxy server after derpServer creation.

Add Prometheus metrics tracking active DERP websocket connections and
bytes relayed through the wsproxy:

- coder_wsproxy_derp_websocket_active_connections (gauge)
- coder_wsproxy_derp_websocket_bytes_total (counter, direction=read|write)

Implementation adds a DERPWebsocketMetrics hook struct and countingConn
wrapper in tailnet/, and a new WithWebsocketSupportAndMetrics function
that instruments the websocket connection lifecycle. The existing
WithWebsocketSupport function delegates to the new one with nil metrics.

…rs.NewExpvarCollector

Removes the hand-rolled enterprise/wsproxy/derpmetrics package and uses
the prometheus client library's NewExpvarCollector instead. This bridges
the same DERP server expvar stats to Prometheus with less code to maintain.

Metrics are now exposed as coder_wsproxy_derp{metric="<key>"} instead of
individual named metrics. Grafana dashboard queries updated accordingly.

- Rename expvar key from "wsproxy_derp" to "derp" to match coderd
- Rename sync.Once variable to expDERPOnce with clearer comment
- Move DERP metrics collector into enterprise/wsproxy/metrics.go
- Revert tailnet/derp.go changes (remove WithWebsocketSupportAndMetrics)
- Remove tailnet/derp_metrics.go (websocket byte counting was redundant
  with the DERP server expvar bytes_received/bytes_sent counters)
- Remove unused collectors import from wsproxy.go

Moves the DERP expvar-to-Prometheus collector to tailnet/ so it can be
shared between coderd and wsproxy. Registers it on both Prometheus
registries. Resolves the existing TODO in coderd/coderd.go.

Metric name is now coder_derp{metric="..."} for both coderd and wsproxy.

Adds --prometheus-enable --prometheus-address=127.0.0.1:2113 to the
local wsproxy started by develop.sh --use-proxy, so DERP metrics can
be verified during development.

…etrics

The generic collectors.NewExpvarCollector exported everything as untyped
metrics under a single name with a label, losing counter/gauge type info
and dropping nested metrics entirely.

Replace with a custom DERPExpvarCollector that:
- Properly types counters (bytes_received_total, packets_sent_total, etc.)
  and gauges (connections, clients_local, etc.)
- Iterates nested metrics.Set for labeled counters (packets_dropped by
  reason, packets_received by kind, tcp_rtt by bucket)
- Uses standard Prometheus naming (coder_derp_* prefix, _total suffix)
- Accepts *derp.Server directly instead of relying on global expvar state

Add TestWorkspaceProxyDERPMetrics to verify the DERPExpvarCollector is
registered during wsproxy startup, mirroring the existing TestDERPMetrics
in coderd.

Also fix expvar.Publish guards in both coderd and wsproxy to check
expvar.Get before publishing. The sync.Once per package was insufficient
when both coderd and wsproxy run in the same test process, as both
attempt to publish under the same "derp" key.

Inline newDERPDesc wrapper to direct prometheus.NewDesc calls so the
metricsdocgen scanner can discover them via static AST analysis. Add
tailnet to the scanner's scanDirs list. Regenerate generated_metrics
and prometheus.md docs.

Remove the expvar HTTP handler and the expvar.Publish call from wsproxy.
The DERP metrics are now exported via the Prometheus collector, making
the unauthenticated expvar endpoint unnecessary. coderd's /debug/expvar
remains (it's behind authenticated routes).

Rename all DERP Prometheus metrics from coder_derp_* to
coder_derp_server_* for clearer namespacing. Regenerate
generated_metrics and prometheus.md docs.

deansheather

@sreya sreya deleted the jon/wsproxy-metrics branch

March 6, 2026 07:58