fix: skip unnecessary API key retrieval when Datadog extension is present by iizukanao · Pull Request #680 · DataDog/datadog-lambda-js
What does this PR do?
This PR refactors the API key retrieval logic in MetricsListener.onStartInvocation() to skip the getAPIKey() call when the Datadog Lambda Extension is running. The API key is now only fetched when the extension is not present, avoiding unnecessary Secrets Manager API calls.
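The control flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual diff: `startApiKeyLookup`, `ApiKeyFetcher`, and the `extensionRunning` flag are hypothetical names standing in for the logic inside MetricsListener.onStartInvocation().

```typescript
// Hypothetical sketch: these names are illustrative, not the library's API.
type ApiKeyFetcher = () => Promise<string>;

/**
 * Starts the API key lookup only when the extension is not present.
 * Returns undefined when the extension is running, so no request to the
 * key store (for example Secrets Manager) is issued at all.
 */
function startApiKeyLookup(
  extensionRunning: boolean,
  fetchApiKey: ApiKeyFetcher,
): Promise<string> | undefined {
  if (extensionRunning) {
    // Per the PR description, the key lookup is unnecessary in this case.
    return undefined;
  }
  // No extension: the key is needed, so start the lookup.
  return fetchApiKey();
}
```

The point of the sketch is that the fetcher is never invoked on the extension path, rather than being invoked and having its result ignored.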
Motivation
This change addresses an issue where Lambda functions using provisioned concurrency with the Datadog Lambda Extension experience "Signature expired" errors in logs when using Secrets Manager for API key storage.
Root Cause:
- When `DD_API_KEY_SECRET_ARN` is configured and the Datadog Lambda Extension is running, the code was still calling `getAPIKey()` during initialization
- Since `getAPIKey()` wasn't being awaited, the Lambda instance could pause during the 3-minute period between initialization and when requests start arriving (a characteristic of provisioned concurrency)
- When requests finally arrived, the Secrets Manager client would attempt to use expired credentials, resulting in `InvalidSignatureException: Signature expired` errors being logged
Environment where this occurs:
- Using `DD_API_KEY_SECRET_ARN` (Secrets Manager)
- Datadog Lambda Extension layer (version 87)
- Provisioned concurrency enabled
- Warm-up calls to the datadog-wrapped handler during initialization
Testing Guidelines
- Local testing: All existing tests pass
- Production validation: Confirmed that Lambda functions using this configuration now run without "Signature expired" errors
- Integration tests: Requesting Datadog team to run integration tests to ensure no regressions
Additional Notes
This change also highlights a broader architectural concern: async operations like getAPIKey() should generally be awaited to prevent Lambda instances from pausing unexpectedly. While this specific fix addresses the immediate issue by avoiding unnecessary API calls when the extension is running, the underlying concern about unawaited async operations remains.
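To make the broader concern concrete, here is a small sketch contrasting a fire-and-forget call with an awaited one. It is not part of this PR, and every name in it (`Fetcher`, `startWithoutAwait`, `startWithAwait`, the `events` log) is hypothetical; it only illustrates the ordering difference.

```typescript
// Hypothetical sketch: none of these names come from datadog-lambda-js.
type Fetcher = () => Promise<string>;

// Fire-and-forget: the lookup is started but nothing waits for it, so the
// caller returns while the request may still be in flight. If the execution
// environment is paused at that point, the request resumes later.
function startWithoutAwait(fetchApiKey: Fetcher, events: string[]): void {
  fetchApiKey().then(() => events.push("key resolved"));
  events.push("init returned");
}

// Awaited: the caller does not finish until the key has been resolved,
// so no lookup is left pending across a pause.
async function startWithAwait(
  fetchApiKey: Fetcher,
  events: string[],
): Promise<void> {
  await fetchApiKey();
  events.push("key resolved");
  events.push("init returned");
}
```

In the first variant "init returned" is recorded before the key resolves; in the second the order is reversed, which is the property the paragraph above argues for.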
Types of Changes
- Bug fix
- New feature
- Breaking change
- Misc (docs, refactoring, dependency upgrade, etc.)
Check all that apply
- This PR's description is comprehensive
- This PR contains breaking changes that are documented in the description
- This PR introduces new APIs or parameters that are documented and unlikely to change in the foreseeable future
- This PR impacts documentation, and it has been updated (or a ticket has been logged)
- This PR's changes are covered by the automated tests
- This PR collects user input/sensitive content into Datadog
- This PR passes the integration tests (ask a Datadog member to run the tests)