fix(streamable_http): handle ClientDisconnect during POST at WARNING level #4

Merged
wiggzz merged 2 commits into dbt-labs/patched-1.27.0 from wj/DI-3218-clientdisconnect-fix-v3
May 5, 2026

@wiggzz wiggzz commented May 5, 2026

Problem

When a client disconnects mid-POST (network timeout, cancel, LB drop), the MCP SDK's _handle_post_request catches ClientDisconnect in its generic except Exception handler, which:

  1. Logs an ERROR ("Error handling POST request") to Datadog — this is pattern 1, 67% of all prod ERRORs in ai-codegen-api
  2. Synthesizes a 500 response and sends it to the already-closed socket
  3. Does not notify the inner session writer, so stateful session tasks can hang
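
The failure mode above can be reproduced in miniature. The sketch below is not the SDK's code: `ClientDisconnect` is a local stand-in for `starlette.requests.ClientDisconnect` (which is a plain `Exception` subclass), and `handle_post_generic`, `read_body`, and `send_response` are hypothetical names. It only illustrates why a generic `except Exception` clause ends up logging an ERROR and attempting a 500 on a disconnect.

```python
import asyncio
import logging

logger = logging.getLogger("sketch.problem")


class ClientDisconnect(Exception):
    """Local stand-in for starlette.requests.ClientDisconnect."""


async def handle_post_generic(read_body, send_response):
    """Hypothetical handler with only a generic except clause."""
    try:
        await read_body()
        await send_response(200)
    except Exception:
        # ClientDisconnect subclasses Exception, so it lands here too:
        # ERROR-level log plus a 500 written to a socket that is already gone.
        logger.exception("Error handling POST request")
        await send_response(500)


async def demo() -> list[int]:
    """Simulate a client that disconnects while the body is being read."""
    sent: list[int] = []

    async def read_body():
        raise ClientDisconnect()

    async def send_response(status: int):
        sent.append(status)

    await handle_post_generic(read_body, send_response)
    return sent
```

Running `asyncio.run(demo())` shows the handler trying to send a 500 even though nothing can receive it.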

Solution

Catch ClientDisconnect before the generic except Exception handler in _handle_post_request:

  • Log at WARNING — honest level: we aborted handling a request, but it wasn't our fault
  • Notify the session writer with ClientDisconnect() so inner session tasks unblock cleanly (stateful sessions)
  • Skip sending a response — the socket is closed; sending to it is just noise
  • Scope: POST only — Datadog confirms 100% of pattern 1 stacks originate from _handle_post_request
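
A minimal sketch of the handler shape described above, under stated assumptions: `ClientDisconnect` is a local stand-in for the Starlette exception, `RecordingWriter` is a hypothetical replacement for the session writer, and `handle_post` is a simplified illustration rather than the patched `_handle_post_request` itself. The point is the ordering of the `except` clauses and the suppressed writer notification.

```python
import asyncio
import contextlib
import logging

logger = logging.getLogger("sketch.solution")


class ClientDisconnect(Exception):
    """Local stand-in for starlette.requests.ClientDisconnect."""


class RecordingWriter:
    """Hypothetical session writer that records what it is sent."""

    def __init__(self):
        self.items = []

    async def send(self, item):
        self.items.append(item)


async def handle_post(read_body, send_response, writer):
    """Simplified handler: the specific clause must precede the generic one."""
    try:
        await read_body()
        await send_response(200)
    except ClientDisconnect:
        logger.warning("Client disconnected during POST request")
        # Unblock inner session tasks; the writer may already be closed,
        # so any error from send() is deliberately swallowed.
        with contextlib.suppress(Exception):
            await writer.send(ClientDisconnect())
        # No response is sent: the socket is closed.
    except Exception:
        logger.exception("Error handling POST request")
        await send_response(500)


async def demo():
    """Simulate a mid-POST disconnect and return what was sent where."""
    sent = []
    writer = RecordingWriter()

    async def read_body():
        raise ClientDisconnect()

    async def send_response(status):
        sent.append(status)

    await handle_post(read_body, send_response, writer)
    return sent, writer.items
```

With this ordering, `asyncio.run(demo())` returns an empty list of responses and a writer that received exactly one `ClientDisconnect` instance.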

Tests

4 new unit tests in tests/server/streamable_http/test_client_disconnect_post.py:

  • test_client_disconnect_logs_warning_not_error — verifies WARNING, not ERROR
  • test_client_disconnect_does_not_send_response — verifies no ASGI sends to closed socket
  • test_client_disconnect_notifies_writer — verifies writer receives ClientDisconnect
  • test_client_disconnect_writer_suppresses_errors — verifies broken writer doesn't crash us
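
For illustration only, here is how checks of this kind might look using the standard library's `unittest`. This is not the content of `test_client_disconnect_post.py`; the handler, `BrokenWriter`, and the test class name are hypothetical stand-ins, and the real tests may well use a different framework.

```python
import contextlib
import logging
import unittest

logger = logging.getLogger("sketch.tests")


class ClientDisconnect(Exception):
    """Local stand-in for starlette.requests.ClientDisconnect."""


class BrokenWriter:
    """Hypothetical writer whose send() always fails, as if already closed."""

    async def send(self, item):
        raise RuntimeError("writer closed")


async def handle_post(read_body, send_response, writer):
    """Simplified stand-in for the patched handler, not the SDK code."""
    try:
        await read_body()
        await send_response(200)
    except ClientDisconnect:
        logger.warning("Client disconnected during POST request")
        with contextlib.suppress(Exception):
            await writer.send(ClientDisconnect())


async def disconnecting_body():
    raise ClientDisconnect()


class ClientDisconnectSketchTests(unittest.IsolatedAsyncioTestCase):
    async def test_warning_logged_nothing_sent_broken_writer_tolerated(self):
        sent = []

        async def send_response(status):
            sent.append(status)

        # assertLogs fails the test if nothing at WARNING or above is logged.
        with self.assertLogs(logger, level="WARNING") as captured:
            await handle_post(disconnecting_body, send_response, BrokenWriter())

        # Exactly one record, at WARNING rather than ERROR.
        self.assertEqual([r.levelno for r in captured.records], [logging.WARNING])
        # No ASGI-style response was attempted.
        self.assertEqual(sent, [])
```

The single test covers three of the listed behaviours at once (WARNING level, no response, broken writer tolerated); splitting them into separate tests, as the PR does, gives clearer failure messages.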

All existing streamable_http tests pass (pre-existing flaky test test_stateless_get_returns_405 excluded).

Upstream context

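This patch combines the approaches of two open upstream PRs (neither merged):

  • modelcontextprotocol#1647 (POST-only scope)
  • modelcontextprotocol#1947 (writer-notification semantics, skip-response)

This is a tight ~10-line change taking the spirit of modelcontextprotocol#1647 and the semantics of modelcontextprotocol#1947. Tracking issue: modelcontextprotocol/python-sdk#1648.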

Follow-up

After this lands in the fork and is verified in prod via ai-codegen-api bump:

  • Open mirror PR upstream
  • Consider dropping ClientDisconnectHandlerMiddleware in ai-codegen-api (no longer load-bearing for POST path)

wiggzz added 2 commits May 5, 2026 11:21
…level

Catch ClientDisconnect in _handle_post_request so client disconnections
(network timeout, cancel, LB drop) log at WARNING instead of ERROR and do
not attempt to send a response to the closed socket.

This fixes pattern 1 (67% of all prod ERRORs in ai-codegen-api): when the
client disconnects mid-POST, the current except-Exception handler synthesizes
a 500 response to a closed socket and logs ERROR-level noise to Datadog.

Changes:
- Catch ClientDisconnect before the generic except-Exception handler
- Log at WARNING (honest level: we aborted but it wasn't our fault)
- Notify the session writer so inner session tasks unblock cleanly
- Suppress errors from writer.send() since the writer may already be closed
- Skip sending a response (the socket is gone)

Upstream context:
- modelcontextprotocol#1647 (POST-only scope, right approach)
- modelcontextprotocol#1947 (writer-notification semantics, skip-response)
- modelcontextprotocol#1648 (tracking issue, still open, no PRs merged)

This patch takes the spirit of modelcontextprotocol#1647 (POST-only scope) + the semantics of
modelcontextprotocol#1947 (notify writer, skip response) in a tight ~10-line change.

Github-Issue:modelcontextprotocol#1648
@wiggzz wiggzz merged commit da62715 into dbt-labs/patched-1.27.0 May 5, 2026
1 of 19 checks passed
@wiggzz wiggzz deleted the wj/DI-3218-clientdisconnect-fix-v3 branch May 5, 2026 16:33