Handle changes to `MutableSettings` and `ExporterSettings` without rebuilding by andrewlock · Pull Request #7724 · DataDog/dd-trace-dotnet

Oct 28, 2025

chatgpt-codex-connector[bot]

@andrewlock

- Move statsd instance creation to separate factory
- Create a StatsdManager to handle automatic updating in response to setting changes
- Always create a statsd instance, as it's hard to know if we're _ever_ going to need one, and doing so reduces some of the complexity

… reconfiguration is not allowed

…s though, and doesn't respond to changes

This isn't necessary with the current design, and it causes issues today

Make sure we can't dispose a stats consumer that's in use (as it will throw)
Rework to use a "lease" mechanism to track usages
Make passing in a `StatsdManager` required
The statsd client does sync-over-async in the flush and dispose paths, which can lead to deadlocks and thread exhaustion.
To work around that, we push the dispose to happen on a thread-pool thread instead, in the background
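
The "lease" idea in the commit messages above can be sketched roughly as follows. This is a language-agnostic illustration written in Python, not the tracer's C# code: `FakeClient`, `LeasedClient`, and every method name are hypothetical, and the only behaviour taken from the text is that a client in use is never disposed and that the dispose call is pushed to a background pool thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeClient:
    """Hypothetical stand-in for a stats client whose dispose may block."""

    def __init__(self):
        self.disposed = threading.Event()

    def dispose(self):
        self.disposed.set()


class LeasedClient:
    """Counts usages of one client and disposes it only once it has been
    retired AND the last outstanding lease has been released."""

    def __init__(self, client, executor):
        self.client = client
        self._executor = executor
        self._lock = threading.Lock()
        self._leases = 0
        self._retired = False
        self._dispose_scheduled = False

    def try_acquire(self):
        """Take a lease; fails once the client has been retired."""
        with self._lock:
            if self._retired:
                return False
            self._leases += 1
            return True

    def release(self):
        """Return a lease; may trigger the deferred dispose."""
        with self._lock:
            self._leases -= 1
            self._schedule_dispose_if_idle()

    def retire(self):
        """Mark the client as replaced (for example after a settings change)."""
        with self._lock:
            self._retired = True
            self._schedule_dispose_if_idle()

    def _schedule_dispose_if_idle(self):
        # Caller holds the lock. Dispose runs on a pool thread so that a
        # blocking dispose never stalls the caller.
        if self._retired and self._leases == 0 and not self._dispose_scheduled:
            self._dispose_scheduled = True
            self._executor.submit(self.client.dispose)


if __name__ == "__main__":
    pool = ThreadPoolExecutor(max_workers=1)
    leased = LeasedClient(FakeClient(), pool)
    leased.try_acquire()   # a consumer starts using the client
    leased.retire()        # settings change: client is replaced, not yet disposed
    leased.release()       # last user finishes: dispose is queued in the background
    pool.shutdown(wait=True)
```

The point of the counter is that retiring and disposing are separate steps: a settings change only retires the old client, and the dispose happens whenever the last user lets go.
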
… config changes (#7796)

## Summary of changes

A fix for #7724 to handle telemetry reporting in dynamic config "reset"
scenarios.

## Reason for change

The system tests for #7724 were failing in some dynamic configuration
scenarios. Specifically, the tests were sending remote config _without_
any configuration values (i.e. "reset to use defaults") and were waiting
for a telemetry update. However, we never sent it, because there was "no
telemetry to record".

Note that we _did_ correctly apply the new configuration, we just didn't
report the telemetry correctly, primarily due to limitations in the
telemetry protocol. This PR adds a fix for that, and will be merged into
#7724.

## Implementation details

The solution is to "remember" the telemetry from the default mutable
configuration values, _without_ any dynamic sources, and "replay" this
telemetry when we update telemetry. This feels kind of hacky, but it's
something I suspected we might need to do, and had been avoiding up to
this point because we do a "full reconfigure" anyway.
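
A minimal sketch of the "remember and replay" idea, written in Python purely for illustration (the tracer itself is C#). The class name `ConfigTelemetry`, the `(value, origin)` tuple shape, and the origin labels are all assumptions, not the tracer's telemetry API. It only shows why an empty dynamic config still yields a non-empty telemetry update once the baseline is replayed.

```python
from typing import Dict, Tuple

# (value, origin); the origin labels used here are hypothetical.
Entry = Tuple[object, str]


class ConfigTelemetry:
    """Remembers telemetry for the settings built WITHOUT dynamic sources
    and replays it each time a dynamic config update is reported."""

    def __init__(self, baseline: Dict[str, Entry]) -> None:
        # Captured once, before any dynamic configuration is applied.
        self._baseline = dict(baseline)

    def build_update(self, dynamic_values: Dict[str, object]) -> Dict[str, Entry]:
        # Replay the baseline first, then layer the dynamic values on top.
        update = dict(self._baseline)
        for key, value in dynamic_values.items():
            update[key] = (value, "remote_config")
        return update


if __name__ == "__main__":
    telemetry = ConfigTelemetry({"sample_rate": (1.0, "default")})
    # An empty "reset to defaults" payload still reports the baseline values.
    print(telemetry.build_update({}))
```

Without the replay step, the empty payload would produce an empty update, which matches the "no telemetry to record" symptom described above.
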

## Test coverage

Added a specific unit test that mimics the behaviour of the system-test
(i.e. an "empty" dynamic config response) and confirms the telemetry is
recorded as expected.

## Other details

https://datadoghq.atlassian.net/browse/LANGPLAT-819

Part of a config stack

- #7522
- #7525
- #7530
- #7532
- #7543
- #7544
- #7721
- #7722
- #7695
- #7723
- #7724
- #7796 👈

Unlike other PRs in the stack, I'll merge this directly into #7724 to
fix the tests there. I just thought I'd keep this separate for easier
reviewing.

tonyredondo

NachoEchevarria

andrewlock deleted the andrew/settings/5-remove-mutablesettings branch

November 26, 2025 18:21

andrewlock added a commit that referenced this pull request

Nov 27, 2025

## Summary of changes

Updates a couple of places where we're calling `Tracer.Instance` where
we don't need to.

## Reason for change

In one of my other PRs I accidentally broke something that should _only_
have affected integration tests, but a bunch of unit tests broke too. It
highlighted where they were using `Tracer.Instance` and setting the
global tracer for tests. In other places, it revealed that tests that
_looked_ independent of other tests really weren't...

Calling `Tracer.Instance` potentially does a _lot_ of work, as it
initializes the tracer. It's also hard to follow if we're making static
calls out in places we don't need to. So just pass through the settings
we want instead.

## Implementation details

Instead of calling `Tracer.Instance.Settings`, use the value of `Tracer`
or `TracerSettings` that's already available wherever possible. It makes
the tests cleaner too.
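
The difference between the two styles can be sketched as below. This is a Python illustration only, with hypothetical names (`TracerSettings.service_name`, `span_service`, `init_count`); it is not the tracer's C# API. It just shows that code taking the settings as a parameter never triggers the global initialization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TracerSettings:
    service_name: str


class Tracer:
    """Toy tracer with a lazily created global instance."""

    _instance = None
    init_count = 0  # counts how often a tracer gets constructed

    def __init__(self, settings):
        Tracer.init_count += 1
        self.settings = settings

    @classmethod
    def instance(cls):
        # The first call performs the (potentially expensive) initialization.
        if cls._instance is None:
            cls._instance = cls(TracerSettings(service_name="global-default"))
        return cls._instance


def span_service_via_global():
    # Static call: initializes the global tracer as a side effect.
    return Tracer.instance().settings.service_name


def span_service(settings):
    # Injected settings: no global state is touched.
    return settings.service_name


if __name__ == "__main__":
    print(span_service(TracerSettings("unit-test")))  # no tracer is created
    print(span_service_via_global())                  # creates the global tracer
```

A test that calls `span_service` with its own settings stays independent of whatever another test did to the global instance.
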

## Test coverage

Same coverage, just a bit cleaner 

## Other details

Included as part of the config stack, just because I already refactored
some of this code, and can't be bothered to faff with merge conflicts:

https://datadoghq.atlassian.net/browse/LANGPLAT-819

- #7522
- #7525
- #7530
- #7532
- #7543
- #7544
- #7721
- #7722
- #7695
- #7723
- #7724
- #7744 👈

andrewlock added a commit that referenced this pull request

Dec 1, 2025

## Summary of changes

- Fix `null` reference exception in `AgentWriter` benchmark (introduced
in #7724)
- Add a (local only) benchmark that tests serialization without flushing
- Use a custom tag object in the benchmarks

## Reason for change

The `AgentWriter` benchmark has been broken since #7724 was merged, as
it's passing a null ref to a non-null field. This fixes that issue.

Also, this adds a benchmark (which _doesn't_ run in CI currently) which
tests the serialization of spans _without_ flushing, so we can
distinguish the source of allocation more easily.

Finally, I made some changes to the spans used in the existing
`AgentWriter` benchmark. This is due to discrepancies I noticed in the
results from the benchmark vs other testing.

## Implementation details

- Pass a `StatsdManager` into the `AgentWriter` instead of `null`
- Create a "noop" `Api` which doesn't do anything, and use that in a
different `AgentWriter` instance. This effectively isolates the overhead
in the benchmark to the serialization _only_, instead of including
"flush" related overhead (which _is_ included in the existing
benchmark). To avoid bloating our benchmarking runs, don't bother to run
this one in CI (it's just useful for investigation).
- Pass a `SqlTags` object into the spans, with a couple of fields set.
- Set a tag _other_ than `env` on the spans. `env` is a "special"
trace-level tag, so it doesn't _really_ do what you think it does here
and IMO is less revealing.
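
The "noop" `Api` approach from the list above can be sketched like this. It is a Python illustration under assumptions: `NoopApi`, `Writer`, `send_traces`, and the JSON encoding are hypothetical stand-ins, not the real `AgentWriter` or its wire format. The writer still serializes every trace, while the transport does nothing, so a benchmark around it would measure serialization alone.

```python
import json
from typing import List


class NoopApi:
    """Transport that accepts a payload and discards it."""

    def send_traces(self, payload: bytes) -> bool:
        return True


class Writer:
    """Toy writer: serializes on write, hands the buffer to the api on flush."""

    def __init__(self, api) -> None:
        self._api = api
        self._buffer: List[bytes] = []

    def write_trace(self, spans) -> None:
        # Serialization cost is paid here, regardless of the transport.
        self._buffer.append(json.dumps(spans).encode("utf-8"))

    def flush(self) -> int:
        payload = b"".join(self._buffer)
        self._buffer.clear()
        self._api.send_traces(payload)  # a no-op with NoopApi
        return len(payload)


if __name__ == "__main__":
    writer = Writer(NoopApi())
    writer.write_trace([{"name": "sql.query", "tags": {"db.type": "sql-server"}}])
    print(writer.flush(), "bytes serialized, nothing sent")
```
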

## Test coverage

Tested locally - running the benchmarks _without_ the sql tags change:

| Method | Runtime | Mean | Error | StdDev | Median | Ratio | RatioSD | Allocated | Alloc Ratio |
|---------------------------- |---------------------|---------:|--------:|---------:|---------:|------:|--------:|----------:|------------:|
| WriteEnrichedTraces | .NET 6.0 | 391.0 us | 7.54 us | 9.54 us | 389.2 us | 0.79 | 0.02 | 105 B | 0.50 |
| WriteEnrichedTraces | .NET Framework 4.7.2 | 499.2 us | 3.95 us | 3.69 us | 499.0 us | 1.00 | 0.00 | 208 B | 1.00 |
| | | | | | | | | | |
| WriteAndFlushEnrichedTraces | .NET 6.0 | 391.9 us | 8.65 us | 24.82 us | 384.9 us | 0.81 | 0.05 | 2697 B | 0.81 |
| WriteAndFlushEnrichedTraces | .NET Framework 4.7.2 | 511.3 us | 5.30 us | 4.43 us | 511.0 us | 1.00 | 0.00 | 3312 B | 1.00 |

I then made the `SqlTags` change, and ran again:

| Method | Runtime | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Allocated | Alloc Ratio |
|---------------------------- |---------------------|---------:|---------:|---------:|------:|--------:|--------:|----------:|------------:|
| WriteEnrichedTraces | .NET 6.0 | 488.9 us | 7.55 us | 11.29 us | 0.70 | 0.02 | - | 110 B | 0.001 |
| WriteEnrichedTraces | .NET Framework 4.7.2 | 703.3 us | 5.56 us | 4.93 us | 1.00 | 0.00 | 17.5781 | 112537 B | 1.000 |
| | | | | | | | | | |
| WriteAndFlushEnrichedTraces | .NET 6.0 | 481.2 us | 9.39 us | 10.43 us | 0.64 | 0.02 | - | 2701 B | 0.02 |
| WriteAndFlushEnrichedTraces | .NET Framework 4.7.2 | 725.2 us | 14.49 us | 27.21 us | 1.00 | 0.00 | 17.5781 | 115641 B | 1.00 |

Yikes, we have some rogue allocation there on .NET FX!

## Other details