Conversation
|
@claude review |
There was a problem hiding this comment.
LGTM — clean bug fix with proper idempotency guard and test coverage.
Extended reasoning...
Overview
The PR touches three files: the OTel context propagation utility (propagation.py), the LangChain callback handler (CallbackHandler.py), and a new test file. The changes extract a shared _detach_context_token_safely helper, add an idempotent _exit_propagation_context method, and fix on_chain_error which was missing a finally block for cleanup.
Security Risks
None. The changes are purely about error-handling and context cleanup in async teardown paths. The use of _RUNTIME_CONTEXT.detach directly (bypassing the OTel public API's error logger) is intentional and scoped narrowly to swallow known-safe context-mismatch errors in async environments.
Level of Scrutiny
Low — this is a targeted bug fix. The _exit_propagation_context method sets the manager field to None before calling __exit__, making double-calls idempotent. The double-call pattern in on_chain_end (try block + finally block) is safe by design. The on_chain_error fix (adding a finally block with _reset) closes a real cleanup gap.
Other Factors
Two focused unit tests are added: one verifying the error-swallowing behavior and one verifying propagation context exit on chain errors. The refactoring reduces code duplication without changing behavior. No CODEOWNERS concerns identified.
…t-error-using-async-with
|
I was actually looking at this code, where on_chain_end has a _reset call but not on_chain_error. I was investigating whether this was causing a memory leak in the dict. @hassiebp could you comment on whether the absence of the _reset call in on_chain_error could also have caused a memory leak? I am experiencing a very gradual memory increase in my application. I thought it was due to not using a global Langchain Handler but the issue persists even after that. I am still on major version 3 of the sdk, can i just port these changes to fix the issue? |
|
This could be possible. This is now fixed in v4, could you please try including the reset in on chain error and report back whether this fixed the memory buildup for you? |
Disclaimer: Experimental PR review
Greptile Summary
This PR fixes two related issues in the Langchain callback handler: (1) silences noisy OpenTelemetry error logs that fire when a context token is detached in a different async execution context than where it was attached, and (2) ensures the propagation context manager is always exited (and state is reset) even when a chain terminates via
on_chain_errorat the root level — something that was previously missing.Key changes:
_detach_context_token_safely()inpropagation.py— calls_RUNTIME_CONTEXT.detach()directly to bypass OTel's built-in error logging while still performing the detachment; consolidates duplicated try/except boilerplate fromCallbackHandler.py._exit_propagation_context()method toLangchainCallbackHandler— idempotent teardown that nulls_propagation_context_managerbefore calling__exit__, making double-calls in the try + finally pattern safe.on_chain_errorto call_exit_propagation_context()and_reset()in afinallyblock — previously neither was called on root-chain errors, leaving the handler in a dirty state.Confidence Score: 5/5
Safe to merge — fixes a real cleanup gap and a noisy log issue with no regressions introduced.
All three changed files are correct. The idempotent
_exit_propagation_contextpattern (nulling the field before calling__exit__) makes the try + finally double-call safe. The_detach_context_token_safelyhelper is well-documented, and the two new tests directly cover both repaired scenarios. No P0 or P1 issues found.No files require special attention.
Important Files Changed
_detach_context_token_safely()helper that calls_RUNTIME_CONTEXT.detach()directly instead of the publicotel_context_api.detach()to bypass OpenTelemetry's built-in error logging on context mismatch;_RUNTIME_CONTEXTimport added at module top level._exit_propagation_context()method; fixeson_chain_errorto properly exit the propagation context and call_reset()in its finally block (previously missing).on_chain_errorfires at the root chain level.Sequence Diagram
sequenceDiagram participant LC as LangChain participant CB as LangchainCallbackHandler participant PC as PropagationContext participant OTel as OpenTelemetry Runtime LC->>CB: on_chain_start(run_id, parent_run_id=None) CB->>PC: _propagation_context_manager.__enter__() PC->>OTel: otel_context_api.attach(context) OTel-->>PC: token CB->>OTel: context.attach(span_ctx) → context_token alt Success path LC->>CB: on_chain_end(run_id, parent_run_id=None) CB->>CB: _detach_observation(run_id) CB->>OTel: _RUNTIME_CONTEXT.detach(context_token) [safe] CB->>CB: span.update() + span.end() CB->>PC: _exit_propagation_context() [try block] PC->>OTel: _RUNTIME_CONTEXT.detach(token) [safe, via __exit__] CB->>CB: finally: _exit_propagation_context() [no-op] + _reset() else Error path (fixed by this PR) LC->>CB: on_chain_error(error, run_id, parent_run_id=None) CB->>CB: _detach_observation(run_id) CB->>OTel: _RUNTIME_CONTEXT.detach(context_token) [safe] CB->>CB: observation.update() + _exit_propagation_context() PC->>OTel: _RUNTIME_CONTEXT.detach(token) [safe, via __exit__] CB->>CB: observation.end() CB->>CB: finally: _exit_propagation_context() [no-op] + _reset() endReviews (1): Last reviewed commit: "fix(langchain): exit propagation context..." | Re-trigger Greptile