Skip to content

fix(langchain): exit propagation context gracefully#1588

Merged
hassiebp merged 3 commits intomainfrom
hassieb/lfe-6414-bug-failed-to-detach-context-error-using-async-with
Mar 30, 2026
Merged

fix(langchain): exit propagation context gracefully#1588
hassiebp merged 3 commits intomainfrom
hassieb/lfe-6414-bug-failed-to-detach-context-error-using-async-with

Conversation

@hassiebp
Copy link
Copy Markdown
Contributor

@hassiebp hassiebp commented Mar 30, 2026

Disclaimer: Experimental PR review

Greptile Summary

This PR fixes two related issues in the Langchain callback handler: (1) silences noisy OpenTelemetry error logs that fire when a context token is detached in a different async execution context than where it was attached, and (2) ensures the propagation context manager is always exited (and state is reset) even when a chain terminates via on_chain_error at the root level — something that was previously missing.

Key changes:

  • Adds _detach_context_token_safely() in propagation.py — calls _RUNTIME_CONTEXT.detach() directly to bypass OTel's built-in error logging while still performing the detachment; consolidates duplicated try/except boilerplate from CallbackHandler.py.
  • Adds _exit_propagation_context() method to LangchainCallbackHandler — idempotent teardown that nulls _propagation_context_manager before calling __exit__, making double-calls in the try + finally pattern safe.
  • Fixes on_chain_error to call _exit_propagation_context() and _reset() in a finally block — previously neither was called on root-chain errors, leaving the handler in a dirty state.
  • Two new unit tests covering the async context-mismatch suppression and the root-chain error teardown path.

Confidence Score: 5/5

Safe to merge — fixes a real cleanup gap and a noisy log issue with no regressions introduced.

All three changed files are correct. The idempotent _exit_propagation_context pattern (nulling the field before calling __exit__) makes the try + finally double-call safe. The _detach_context_token_safely helper is well-documented, and the two new tests directly cover both repaired scenarios. No P0 or P1 issues found.

No files require special attention.

Important Files Changed

Filename Overview
langfuse/_client/propagation.py Introduces _detach_context_token_safely() helper that calls _RUNTIME_CONTEXT.detach() directly instead of the public otel_context_api.detach() to bypass OpenTelemetry's built-in error logging on context mismatch; _RUNTIME_CONTEXT import added at module top level.
langfuse/langchain/CallbackHandler.py Replaces inline safe-detach logic with the shared helper; adds idempotent _exit_propagation_context() method; fixes on_chain_error to properly exit the propagation context and call _reset() in its finally block (previously missing).
tests/test_langchain_callback_handler.py New test file with two targeted tests: one verifying no OTel error log is emitted on context mismatch, and one verifying the propagation context is fully torn down when on_chain_error fires at the root chain level.

Sequence Diagram

sequenceDiagram
    participant LC as LangChain
    participant CB as LangchainCallbackHandler
    participant PC as PropagationContext
    participant OTel as OpenTelemetry Runtime

    LC->>CB: on_chain_start(run_id, parent_run_id=None)
    CB->>PC: _propagation_context_manager.__enter__()
    PC->>OTel: otel_context_api.attach(context)
    OTel-->>PC: token
    CB->>OTel: context.attach(span_ctx) → context_token

    alt Success path
        LC->>CB: on_chain_end(run_id, parent_run_id=None)
        CB->>CB: _detach_observation(run_id)
        CB->>OTel: _RUNTIME_CONTEXT.detach(context_token) [safe]
        CB->>CB: span.update() + span.end()
        CB->>PC: _exit_propagation_context() [try block]
        PC->>OTel: _RUNTIME_CONTEXT.detach(token) [safe, via __exit__]
        CB->>CB: finally: _exit_propagation_context() [no-op] + _reset()
    else Error path (fixed by this PR)
        LC->>CB: on_chain_error(error, run_id, parent_run_id=None)
        CB->>CB: _detach_observation(run_id)
        CB->>OTel: _RUNTIME_CONTEXT.detach(context_token) [safe]
        CB->>CB: observation.update() + _exit_propagation_context()
        PC->>OTel: _RUNTIME_CONTEXT.detach(token) [safe, via __exit__]
        CB->>CB: observation.end()
        CB->>CB: finally: _exit_propagation_context() [no-op] + _reset()
    end
Loading

Reviews (1): Last reviewed commit: "fix(langchain): exit propagation context..." | Re-trigger Greptile

@github-actions
Copy link
Copy Markdown

@claude review

Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — clean bug fix with proper idempotency guard and test coverage.

Extended reasoning...

Overview

The PR touches three files: the OTel context propagation utility (propagation.py), the LangChain callback handler (CallbackHandler.py), and a new test file. The changes extract a shared _detach_context_token_safely helper, add an idempotent _exit_propagation_context method, and fix on_chain_error which was missing a finally block for cleanup.

Security Risks

None. The changes are purely about error-handling and context cleanup in async teardown paths. The use of _RUNTIME_CONTEXT.detach directly (bypassing the OTel public API's error logger) is intentional and scoped narrowly to swallow known-safe context-mismatch errors in async environments.

Level of Scrutiny

Low — this is a targeted bug fix. The _exit_propagation_context method sets the manager field to None before calling __exit__, making double-calls idempotent. The double-call pattern in on_chain_end (try block + finally block) is safe by design. The on_chain_error fix (adding a finally block with _reset) closes a real cleanup gap.

Other Factors

Two focused unit tests are added: one verifying the error-swallowing behavior and one verifying propagation context exit on chain errors. The refactoring reduces code duplication without changing behavior. No CODEOWNERS concerns identified.

@hassiebp hassiebp enabled auto-merge (squash) March 30, 2026 14:03
@hassiebp hassiebp merged commit 549faca into main Mar 30, 2026
13 checks passed
@hassiebp hassiebp deleted the hassieb/lfe-6414-bug-failed-to-detach-context-error-using-async-with branch March 30, 2026 14:14
@lssatvik-afb
Copy link
Copy Markdown

I was actually looking at this code, where on_chain_end has a _reset call but not on_chain_error. I was investigating whether this was causing a memory leak in the dict. @hassiebp could you comment on whether the absence of the _reset call in on_chain_error could also have caused a memory leak? I am experiencing a very gradual memory increase in my application. I thought it was due to not using a global Langchain Handler but the issue persists even after that.

I am still on major version 3 of the sdk, can i just port these changes to fix the issue?

Copy link
Copy Markdown
Contributor Author

This could be possible. This is now fixed in v4, could you please try including the reset in on chain error and report back whether this fixed the memory buildup for you?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: "Failed to detach context" error using async with langchain CallbackHandler

2 participants