Comparing v90...v91 · DataDog/datadog-lambda-extension
Commits on Dec 3, 2025
-
[SVLS-8054] add integration testing (#946)
## Overview

Set up integration tests for the lambda extension. These integration tests will run on every PR.

### Details

This PR includes:

* CDK stacks for deploying lambda integration tests. This covers the lambda, and any related resources, we want to test against.
* Integration tests, set up with Jest. These invoke lambda functions, wait, then fetch Datadog telemetry data to verify against.
* Gitlab integration test step (info below).
* `README.md` describing how to run tests locally.

Note: for simplicity, this is set up to test only the ARM variant (not AMD). It also doesn't include FIPS or AppSec builds. I think this is a reasonable starting point for our integration tests, and we can evaluate adding additional configuration support as needed.

### Gitlab Integration Test Step

The integration tests step in Gitlab will:

1. Publish the lambda extension.
2. Deploy CDK stacks, using the newly published lambda extension.
3. Run a test suite.
4. Destroy the CDK stacks.
5. Delete the lambda extension.

### Executing the integration tests

The integration tests run automatically on every PR. Developers can also run them locally with `npm run test`. Full information is included in `README.md`.

### Example Integration Tests

I added two basic tests, one for Node and one for Python. These lambda functions log 'Hello World' and are set up with the extension and tracer library. The integration test gets the logs and traces from Datadog. It confirms that we have a log with the message 'Hello World!'. It also confirms we have spans with the names `aws.lambda.cold_start`, `aws.lambda.load`, and `aws.lambda`. Note that this isn't actually working correctly for Python for `aws.lambda.load` and `aws.lambda.cold_start`: those spans are created, but with a different traceId, so they aren't getting linked to `aws.lambda`. I will follow up and investigate. I plan on having a follow-up PR with other runtimes.
## Testing

This PR triggered the integration tests; see the [corresponding gitlab pipeline](https://gitlab.ddbuild.io/DataDog/datadog-lambda-extension/-/pipelines/84401218) with the newly added step 'integration-tests' (or see 'dd-gitlab/integration-test' in the checks for this PR). The results from the integration test can be obtained by going to the [integration step](https://gitlab.ddbuild.io/DataDog/datadog-lambda-extension/-/jobs/1262527100) and downloading the artifacts. Screenshot attached below.

<img width="1786" height="1177" alt="Screenshot 2025-12-01 at 9 14 03 AM" src="https://github.com/user-attachments/assets/d1e1313c-b986-44d8-ad65-ab6341d0b909" />
Commits on Dec 4, 2025
-
fix(config): support colons in tag values (URLs, etc.) (#953)
https://datadoghq.atlassian.net/browse/SVLS-8095

## Overview

Tag parsing previously used `split(':')`, which broke values containing colons, like URLs (`git.repository_url:https://...`). Changed to use `splitn(2, ':')` to split only on the first colon, preserving the rest as the value.

Changes:

- Add `parse_key_value_tag()` helper to centralize parsing logic
- Refactor `deserialize_key_value_pairs` to use the helper
- Refactor `deserialize_key_value_pair_array_to_hashmap` to use the helper
- Add comprehensive test coverage for URL values and edge cases

## Testing

Unit tests; expect e2e tests to pass.

Co-authored-by: tianning.li <tianning.li@datadoghq.com>
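The first-colon split described above can be sketched as follows. The helper name `parse_key_value_tag` comes from the PR, but its exact signature here is an assumption:

```rust
/// Split a `key:value` tag on the FIRST colon only, so values that
/// themselves contain colons (e.g. URLs) are preserved intact.
/// Sketch only; the real helper in the PR may have a different signature.
fn parse_key_value_tag(tag: &str) -> Option<(String, String)> {
    let mut parts = tag.splitn(2, ':');
    match (parts.next(), parts.next()) {
        (Some(key), Some(value)) if !key.is_empty() => {
            Some((key.to_string(), value.to_string()))
        }
        _ => None,
    }
}

fn main() {
    // A URL value survives because only the first colon separates key from value.
    let parsed = parse_key_value_tag("git.repository_url:https://github.com/DataDog/x");
    assert_eq!(
        parsed,
        Some((
            "git.repository_url".to_string(),
            "https://github.com/DataDog/x".to_string()
        ))
    );
    // A tag with no colon has no value part.
    assert_eq!(parse_key_value_tag("no_colon_here"), None);
    println!("ok");
}
```

With the old `split(':')` approach, `"a:b:c"` would have produced three fragments and dropped everything after the second colon; `splitn(2, ':')` yields exactly `("a", "b:c")`.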
-
docs: Add Lambda Managed Instance mode documentation (#951)
https://datadoghq.atlassian.net/browse/SVLS-8083

## Overview

Add comprehensive documentation for Lambda Managed Instance support (v90+):

- Overview of Managed Instance mode and how it differs from standard Lambda
- Automatic detection and optimization behavior
- Background continuous flushing architecture with zero per-invocation overhead
- Key differences comparison table (invocation model, flushing, use cases)
- Getting started guide for users

Also clarifies that custom continuous flush intervals are respected in Managed Instance mode (not completely ignored, as previously stated).

## Testing

n/a
Commits on Dec 9, 2025
-
add integration tests for node and java (#958)
## Overview

* Adding integration tests for Java and Dotnet. These are similar to the existing Node/Python integration tests in that they just test very basic functionality: we get logs/traces from the lambda function. These tests are meant to be our starting point and serve as example setups for other integration tests.
* Fixed how we are filtering logs, to use `@lambda.request_id:{requestId}`. This uses log attributes instead of checking the actual log message.
* Updated the stack cleanup step to execute a CLI command instead of a CDK command. There was a slight issue when cleaning up the Java/Dotnet stacks due to their code assets, so the CLI command was easier.
* Added the tag `extension_integration_test: true` to all of the stacks, to make it easier to clean up stacks if cleanup gets missed (deployed locally and forgot to clean up, pipeline cancelled before the cleanup step, etc.). A follow-up item is to create a lambda function that periodically runs and cleans up all old stacks with this tag.

## Testing

* Integ tests for this PR passed.
* Checked the AWS account and confirmed that there are no stacks with the prefix `integ-61910f24`.

-
chore: Add a timer to avoid repeated debug logs (#954)
## Problem

When the Lambda runtime spins down, the extension may enter a loop waiting for unfinished work, printing hundreds of thousands of identical log lines:

> LOGS_AGENT | No more events to process but still have senders, continuing to drain...

For example, in one of my tests, this line was printed 31602 times within 1.28 seconds.

<img width="1097" height="344" alt="image" src="https://github.com/user-attachments/assets/b3aff30c-596f-4837-a081-fbe165ba6254" />

This:

1. slightly complicates debugging for our engineers
2. adds cost and confusion for customers who turn on `DD_LOG_LEVEL=debug` to debug the extension

## This PR

Add a timer and print this line at most once every 100ms, so it will be printed at most 20 times within the 2-second spindown period.

## Testing

No testing for now; the change should be straightforward. Will see if logs are reduced in future debugging.
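The timer described above can be sketched with `std::time::Instant`. The struct and method names below are illustrative, not the PR's actual code:

```rust
use std::time::{Duration, Instant};

/// Rate-limits a repeated message: `should_emit` returns true only when
/// at least `min_interval` has elapsed since it last returned true.
/// Mirrors the "at most once every 100ms" behavior described above;
/// names are hypothetical.
struct RateLimitedLog {
    min_interval: Duration,
    last_emitted: Option<Instant>,
}

impl RateLimitedLog {
    fn new(min_interval: Duration) -> Self {
        Self { min_interval, last_emitted: None }
    }

    fn should_emit(&mut self, now: Instant) -> bool {
        match self.last_emitted {
            Some(last) if now.duration_since(last) < self.min_interval => false,
            _ => {
                self.last_emitted = Some(now);
                true
            }
        }
    }
}

fn main() {
    let mut limiter = RateLimitedLog::new(Duration::from_millis(100));
    let start = Instant::now();
    let mut emitted = 0;
    // Simulate a tight drain loop firing every millisecond for 2 seconds.
    for ms in 0..2000u64 {
        let now = start + Duration::from_millis(ms);
        if limiter.should_emit(now) {
            emitted += 1;
        }
    }
    // 2000ms of spindown / 100ms window => 20 emissions
    // (the first fires immediately, then one per window).
    println!("emitted {emitted} times");
}
```

The drain loop would call `should_emit` before printing the "continuing to drain" line, turning 31602 prints into at most 20.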
Commits on Dec 10, 2025
-
APPSEC-60188: gracefully accept `null` in APIGW response (#960)
I strongly suspect the .NET Lambda SDK (from Amazon) produces `null` values instead of omitting fields, which appears to be accepted by API Gateway but is presently rejected by our parsing logic. This PR addresses the problem and adds a new test case. JJ-Change-Id: vprmkv ZD: 2375557 Jira: APPSEC-60188
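The lenient parsing described here, where an explicit JSON `null` is accepted and treated like an omitted field, can be illustrated with a small stdlib-only sketch. This is a model of the idea, not the extension's actual parsing code:

```rust
/// A field in a parsed APIGW response can be absent, explicitly null,
/// or present with a value. Lenient parsing treats Null like Missing
/// instead of rejecting the whole payload. Illustrative only.
#[derive(Debug, PartialEq)]
enum Field<T> {
    Missing,
    Null,
    Value(T),
}

impl<T> Field<T> {
    /// Collapse both "field omitted" and "field: null" into None.
    fn into_option(self) -> Option<T> {
        match self {
            Field::Value(v) => Some(v),
            Field::Missing | Field::Null => None,
        }
    }
}

fn main() {
    // A .NET SDK that serializes `null` instead of omitting the field
    // should parse the same as one that omits it entirely.
    let omitted: Field<String> = Field::Missing;
    let explicit_null: Field<String> = Field::Null;
    assert_eq!(omitted.into_option(), None);
    assert_eq!(explicit_null.into_option(), None);
    println!("ok");
}
```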
-
fix build layer script usages (#931)
## Overview

More accurate commenting on using the `build_bottlecap_layer` script.

Co-authored-by: olivier.ndjikenzia <olivier.ndjikenzia@datadoghq.com>
Commits on Dec 15, 2025
-
[SVLS-7934] feat: Support TLS certificate for trace/stats flusher (#961)
## Problem

A customer reported that their Lambda is behind a proxy, and the Rust-based extension can't send traces to Datadog via the proxy, while the previous Go-based extension worked.

## This PR

Supports the env var `DD_TLS_CERT_FILE`: the path to a file of concatenated CA certificates in PEM format. Example: `DD_TLS_CERT_FILE=/opt/ca-cert.pem`. When the extension flushes traces/stats to Datadog, the HTTP client it creates can load and use this cert and connect to the proxy properly.

## Testing

### Steps

1. Create a Lambda in a VPC with an NGINX proxy.
2. Add a layer to the Lambda which includes the CA certificate `ca-cert.pem`.
3. Set env vars:
   - `DD_TLS_CERT_FILE=/opt/ca-cert.pem`
   - `DD_PROXY_HTTPS=http://10.0.0.30:3128`, where `10.0.0.30` is the private IP of the proxy EC2 instance
   - `DD_LOG_LEVEL=debug`
4. Update routing rules of security groups so the Lambda can reach `http://10.0.0.30:3128`.
5. Invoke the Lambda.

### Result

**Before**: trace flush failed with error logs:

> DD_EXTENSION | ERROR | Max retries exceeded, returning request error error=Network error: client error (Connect) attempts=1
> DD_EXTENSION | ERROR | TRACES | Request failed: No requests sent

**After**: trace flush is successful:

> DD_EXTENSION | DEBUG | TRACES | Flushing 1 traces
> DD_EXTENSION | DEBUG | TRACES | Added root certificate from /opt/ca-cert.pem
> DD_EXTENSION | DEBUG | TRACES | Proxy connector created with proxy: Some("http://10.0.0.30:3128")
> DD_EXTENSION | DEBUG | Sending with retry url=https://trace.agent.datadoghq.com/api/v0.2/traces payload_size=1120 max_retries=1
> DD_EXTENSION | DEBUG | Received response status=202 Accepted attempt=1
> DD_EXTENSION | DEBUG | Request succeeded status=202 Accepted attempts=1
> DD_EXTENSION | DEBUG | TRACES | Flushing took 1609 ms

## Notes

This fix only covers the trace flusher and stats flusher, which use `ServerlessTraceFlusher::get_http_client()` to create the HTTP client. It doesn't cover the logs flusher and proxy flusher, which use a different function (`http.rs:get_client()`) to create the HTTP client. However, logs flushing was successful in my tests, even when no certificate was added. We can come back to the logs/proxy flusher if someone reports an error.

-
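Loading a file of concatenated PEM certificates, as `DD_TLS_CERT_FILE` expects, amounts to splitting the file on the PEM end marker before handing each block to the TLS library. A stdlib-only sketch of that splitting step (the extension itself presumably relies on its TLS crate's PEM parser):

```rust
/// Split a string of concatenated PEM certificates into individual
/// certificate blocks. Illustrates the file format DD_TLS_CERT_FILE
/// expects; real code would use a TLS crate's PEM parser instead.
fn split_pem_certs(pem: &str) -> Vec<String> {
    const END: &str = "-----END CERTIFICATE-----";
    let mut certs = Vec::new();
    let mut rest = pem;
    while let Some(idx) = rest.find(END) {
        // Take everything up to and including the end marker.
        let (block, tail) = rest.split_at(idx + END.len());
        let block = block.trim();
        if block.starts_with("-----BEGIN CERTIFICATE-----") {
            certs.push(block.to_string());
        }
        rest = tail;
    }
    certs
}

fn main() {
    // Two dummy certificates concatenated, as in /opt/ca-cert.pem.
    let bundle = "\
-----BEGIN CERTIFICATE-----
AAAA
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
BBBB
-----END CERTIFICATE-----
";
    let certs = split_pem_certs(bundle);
    println!("found {} certificates", certs.len());
}
```

Each recovered block would then be added as a root certificate on the flusher's HTTP client, matching the "Added root certificate from /opt/ca-cert.pem" debug line above.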
chore: Upgrade libdatadog (#964)
## Overview

The crate `datadog-trace-obfuscation` has been renamed to `libdd-trace-obfuscation`. This PR updates the dependency.

## Testing