Fix master CLI connection slot leak on client disconnect #15
Open
alexanderstephan wants to merge 1 commit into master
…nect

In master-worker mode the master CLI proxy (mworker_proxy) has a hardcoded maxconn of 10. When a client connects to the master CLI socket and issues a command that gets forwarded to an unresponsive worker (e.g. one that is stuck or very slow), the connection hangs waiting for the worker's response. If the client then disconnects (timeout, Ctrl-C, etc.), the connection slot is never released because the client-side FIN is never propagated to tear down the backend.

After 10 such leaked slots the master CLI socket becomes completely unreachable, returning "Resource temporarily unavailable" to any new connection attempt.

The fix has three parts:

1) Remove sc_schedule_shutdown(s->scb) after command forwarding. The worker doesn't need the TCP FIN to know when a command ends - CLI commands are newline-delimited. Without this shutdown, the backend never enters half-close during normal command processing, so timeout server-fin is never implicitly armed by process_stream().

2) In the AN_RES_WAIT_CLI early-return path, call sc_set_hcto(s->scb) only when the client has disconnected (SC_FL_EOS on scf) and there is no more pending request data. This arms the 1s server-fin timer exclusively in the stuck-worker scenario.

3) Call channel_dont_close(req) to prevent process_stream() from auto-forwarding the client's FIN to the backend via CF_AUTO_CLOSE.

A 1s timeout server-fin is configured on mworker_proxy. It is only armed after the client disconnects cleanly, so it never fires during normal command processing. It ensures a stuck backend releases its connection slot promptly once the client is gone.

Locally-handled commands (master applet) are unaffected because they complete synchronously before the client has reason to disconnect. The scb->ioto reset (TICK_ETERNITY) at end-of-transaction in pcli_wait_for_response() prevents any timer leakage between commands.

This fixes GH issue haproxy#3351.

This should be backported to all stable branches.
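The slot accounting described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not HAProxy code: every `toy_*` name is hypothetical, time is a plain millisecond counter, and the real fix works on stream connectors through sc_set_hcto() and channel_dont_close() as the commit message explains. The model only shows why 10 leaked slots make the socket unreachable, and why arming a 1s timer on client disconnect frees them again.

```c
#include <assert.h>
#include <stdbool.h>

#define MAXCONN       10    /* mirrors the hardcoded maxconn from the description */
#define SERVER_FIN_MS 1000  /* mirrors the 1s server-fin timeout */

/* Hypothetical, simplified stand-in for one master CLI stream. */
struct toy_stream {
    bool in_use;        /* occupies a connection slot */
    bool client_eos;    /* the client has disconnected */
    bool req_pending;   /* request data still waiting to be forwarded */
    int  fin_expire_ms; /* -1 = timer not armed, otherwise absolute expiry */
};

struct toy_proxy {
    struct toy_stream slots[MAXCONN];
    bool arm_fin_on_client_eos; /* false = old behaviour, true = patched */
};

/* Returns a slot index, or -1 when every slot is taken. */
static int toy_accept(struct toy_proxy *px)
{
    for (int i = 0; i < MAXCONN; i++) {
        if (!px->slots[i].in_use) {
            px->slots[i] = (struct toy_stream){ .in_use = true,
                                                .fin_expire_ms = -1 };
            return i;
        }
    }
    return -1;
}

/* The client goes away while the (stuck) worker has not answered yet. */
static void toy_client_disconnect(struct toy_proxy *px, int idx, int now_ms)
{
    assert(idx >= 0 && idx < MAXCONN);
    struct toy_stream *s = &px->slots[idx];

    s->client_eos = true;
    /* Patched behaviour: arm the timer only once the client is gone
     * and no request data is left to forward. */
    if (px->arm_fin_on_client_eos && !s->req_pending)
        s->fin_expire_ms = now_ms + SERVER_FIN_MS;
}

/* Release every slot whose timer has expired. */
static void toy_tick(struct toy_proxy *px, int now_ms)
{
    for (int i = 0; i < MAXCONN; i++) {
        struct toy_stream *s = &px->slots[i];

        if (s->in_use && s->fin_expire_ms >= 0 && now_ms >= s->fin_expire_ms)
            s->in_use = false;
    }
}

/* Leak n connections against a stuck worker, wait wait_ms, then try one
 * more connection. Returns true when that extra connection gets a slot. */
static bool toy_reachable_after_leaks(bool patched, int n, int wait_ms)
{
    struct toy_proxy px = { .arm_fin_on_client_eos = patched };
    int now_ms = 0;

    for (int i = 0; i < n; i++) {
        int idx = toy_accept(&px);

        if (idx < 0)
            return false; /* already saturated */
        toy_client_disconnect(&px, idx, now_ms);
    }
    now_ms += wait_ms;
    toy_tick(&px, now_ms);
    return toy_accept(&px) >= 0;
}
```

Calling `toy_reachable_after_leaks(false, MAXCONN, 5000)` models the unpatched case: no timer is ever armed, so the extra connection is refused however long the caller waits. With `patched` set to true the same call succeeds once at least `SERVER_FIN_MS` has elapsed.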
9081d8a to f218e22