Proposal for an additional instrumentation hook point by tinnou · Pull Request #4206 · graphql-java/graphql-java

CONTEXT

In our infrastructure, we like to capture resolver latency along with error presence on a single metric. This unlocks alerting based on resolver error rate & resolver latency. We also leverage centralized resolver exception handling (via DataFetchingExceptionHandler) and implementations often contains logic to transforms certain classes of exceptions as client or server GraphQL errors. For example a data fetcher might raise a ValidationException which will be caught by the ExceptionHandler and mapped as a client GraphQL error (bad request error type).

Unfortunately right now, the existing instrumentation hooks (e.g instrumentDataFetcher or beginFieldFetching/beginFieldFetch) are insufficient because it's impossible to observe the post-exception-handling DataFetcherResult. So in the example above, instrumentations would only have access to the ValidationException and not the graphql client error because exception handling has not happened yet. Emitting a metric that tracks the rate of client/server errors as well as latency becomes complicated without replicating the exception handling inside instrumentations.

WHAT IS THIS PR PROPOSING?

This PR exposes an additional method on default void onExceptionHandled(DataFetcherResult<Object> dataFetcherResult) on FieldFetchingInstrumentationContext to observe the final DataFetcherResult after exception handling. This is part of the beginFieldFetching instrumentation method callback.

Breaking Changes

  • This is an additive, API backwards-compatible extension. Existing instrumentation remains valid. Instrumentation implementations that want these new capabilities can implement the new interface.
  • the onCompleted method is now called after exception handling, which is a minor change in behavior from the current implementation.

I put this PR together as a discussion starting point (using the Cunningham's Law), there are definitive alternatives that I thought of and would love your input.

Alternatives considered:

  1. Extend the existing InstrumentationContext API with new callbacks:
    • Pros: single context type.
    • Cons: would modify the general-purpose context contract and could break or confuse clients; less explicit about field-fetching semantics.
    • Rejected to preserve separation of concerns and minimize API surface changes.
  2. Add separate callbacks on Instrumentation (new methods) rather than a context:
    • Pros: offers a purely additive solution
    • Cons: less flexible for per-field context state and harder to reuse existing InstrumentationContext lifecycle semantics. Adding extra hooks could make the execution feel crowded.