Skip to content

fix(streamable_http): handle ClientDisconnect during POST at WARNING level#2

Closed
wiggzz wants to merge 115 commits intodbt-labs/patched-1.27.0from
wj/DI-3218-clientdisconnect-fix
Closed

fix(streamable_http): handle ClientDisconnect during POST at WARNING level#2
wiggzz wants to merge 115 commits intodbt-labs/patched-1.27.0from
wj/DI-3218-clientdisconnect-fix

Conversation

@wiggzz
Copy link
Copy Markdown

@wiggzz wiggzz commented May 5, 2026

Problem

When a client disconnects mid-POST (network timeout, cancel, LB drop), the MCP SDK's _handle_post_request catches ClientDisconnect in its generic except Exception handler, which:

  1. Logs an ERROR ("Error handling POST request") to Datadog — this is pattern 1, 67% of all prod ERRORs in ai-codegen-api
  2. Synthesizes a 500 response and sends it to the already-closed socket
  3. Does not notify the inner session writer, so stateful session tasks can hang

Solution

Catch ClientDisconnect before the generic except Exception handler in _handle_post_request:

  • Log at WARNING — honest level: we aborted handling a request, but it wasn't our fault
  • Notify the session writer with ClientDisconnect() so inner session tasks unblock cleanly (stateful sessions)
  • Skip sending a response — the socket is closed; sending to it is just noise
  • Scope: POST only — Datadog confirms 100% of pattern 1 stacks originate from _handle_post_request

Tests

4 new unit tests in tests/server/streamable_http/test_client_disconnect_post.py:

  • test_client_disconnect_logs_warning_not_error — verifies WARNING, not ERROR
  • test_client_disconnect_does_not_send_response — verifies no ASGI sends to closed socket
  • test_client_disconnect_notifies_writer — verifies writer receives ClientDisconnect
  • test_client_disconnect_writer_suppresses_errors — verifies broken writer doesn't crash us

All existing streamable_http tests (49 + 18) pass.

Upstream context

This patch combines the approach of two open upstream PRs (neither merged):

This is a tight ~10-line change taking the spirit of modelcontextprotocol#1647 + semantics of modelcontextprotocol#1947. Tracking issue: modelcontextprotocol/python-sdk#1648.

Follow-up

After this lands in the fork and is verified in prod via ai-codegen-api bump:

  • Open mirror PR upstream
  • Consider dropping ClientDisconnectHandlerMiddleware in ai-codegen-api (no longer load-bearing for POST path)

yugannkt and others added 30 commits December 19, 2025 11:17
…#1481)

Co-authored-by: Max Isbey <224885523+maxisbey@users.noreply.github.com>
…col#1503)

Co-authored-by: Paul Carleton <paulcarletonjr@gmail.com>
…textprotocol#1808)

Co-authored-by: Max Isbey <224885523+maxisbey@users.noreply.github.com>
…xtprotocol#1826)

Co-authored-by: Jay Hemnani <jayhemnani9910@gmail.com>
Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>
…col#1840)

Co-authored-by: Jacem Elwaar <jacem@mcpappsbuilders.com>
Kludex and others added 27 commits February 7, 2026 14:12
…across 1 directory (modelcontextprotocol#2031)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>
…ross 1 directory (modelcontextprotocol#2036)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>
…delcontextprotocol#2038)

Co-authored-by: Lee Hubbard <hubbard.zlee@unknowncyber.com>
Co-authored-by: Max Isbey <224885523+maxisbey@users.noreply.github.com>
…level

Catch ClientDisconnect in _handle_post_request so client disconnections
(network timeout, cancel, LB drop) log at WARNING instead of ERROR and do
not attempt to send a response to the closed socket.

This fixes pattern 1 (67% of all prod ERRORs in ai-codegen-api): when the
client disconnects mid-POST, the current except-Exception handler synthesizes
a 500 response to a closed socket and logs ERROR-level noise to Datadog.

Changes:
- Catch ClientDisconnect before the generic except-Exception handler
- Log at WARNING (honest level: we aborted but it wasn't our fault)
- Notify the session writer so inner session tasks unblock cleanly
- Suppress errors from writer.send() since the writer may already be closed
- Skip sending a response (the socket is gone)

Upstream context:
- modelcontextprotocol#1647 (POST-only scope, right approach)
- modelcontextprotocol#1947 (writer-notification semantics, skip-response)
- modelcontextprotocol#1648 (tracking issue, still open, no PRs merged)

This patch takes the spirit of modelcontextprotocol#1647 (POST-only scope) + the semantics of
modelcontextprotocol#1947 (notify writer, skip response) in a tight ~10-line change.

Github-Issue:modelcontextprotocol#1648
@wiggzz
Copy link
Copy Markdown
Author

wiggzz commented May 5, 2026

Closing to fix base branch — re-opening with only our commit.

@wiggzz wiggzz closed this May 5, 2026
@wiggzz wiggzz deleted the wj/DI-3218-clientdisconnect-fix branch May 5, 2026 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.