Skip to content

Harden HTTP streaming proxy surface#437

Open
larohra wants to merge 13 commits into
v4.xfrom
larohra/harden-auth
Open

Harden HTTP streaming proxy surface#437
larohra wants to merge 13 commits into
v4.xfrom
larohra/harden-auth

Conversation

@larohra
Copy link
Copy Markdown

@larohra larohra commented May 13, 2026

Context

  • Round-1 scope still stands: src/http/httpProxy.ts is the internal HTTP streaming bridge behind app.setup({ enableHttpStream: true }).
  • The current library implementation calls server.listen() with no host, so Node binds the proxy on a wildcard interface even though the worker advertises http://localhost:<port>/ back to the host.
  • In azure-functions-host, src/WebJobs.Script.Grpc/Channel/WorkerChannel.cs only parses HttpUri with new Uri(httpUri) and uses it for proxying, while src/WebJobs.Script/Http/DefaultHttpProxyService.cs overwrites x-ms-invocation-id before forwarding.

Assessment

  • Finding relevance: Actionable, but narrower than a generic “public unauthenticated endpoint” finding. Normal front-door callers cannot pick the correlation header because the host overwrites it, and the host does not expose HttpUri as a customer contract. The real gap is the worker’s wildcard listener plus permissive handling of direct requests from anything that can reach the sandbox/pod namespace. Even if the host never leaks the URI, the wildcard bind (and the fixed 55000-55025 fallback range) still creates discoverable local surface.
  • Will repo-only fixes satisfy the host as-is? Yes for the repo-local reachability issue: the host already assumes same-host reachability because the worker reports localhost today, and it accepts any absolute HttpUri. No for full channel authentication: the host does not send any shared secret today, and WorkerChannel starts HTTP forwarding before the worker receives the gRPC invocation, so the library cannot repo-only pre-whitelist expected invocation IDs.
  • Are fixes necessary? Yes. Binding to concrete loopback and promptly rejecting malformed proxy traffic are necessary hardening steps. Adding a new repo-only auth header/token is not viable without a coordinated host change.
  • Historical context: PR Http stream support #212 (“Http stream support”) prioritized minimal user changes and introduced the default server.listen() behavior with no evidence that wildcard binding was an intentional security posture. PR Add port validation checks for HTTP streaming #315 (“Add port validation checks for HTTP streaming”) later fixed Windows/VNET port-0 reliability but still returned localhost, which reinforces that loopback was the intended contract and wildcard binding was an implementation gap.
  • Regression / breaking-change risk: Low if the worker binds and returns the exact loopback URI and preserves PR Add port validation checks for HTTP streaming #315 rebinding behavior. The host already consumes HttpUri as an opaque absolute URI string, so returning 127.0.0.1 or [::1] is not a customer API break. The real risks are IPv4/IPv6 family selection, URI formatting, and port-0 retry behavior.
  • Current test posture: Existing round-1 evidence had the full suite green after npm ci (357 passing), but this repo still lacks direct coverage for src/http/httpProxy.ts and ProgrammingModel.getCapabilities/streamed invocation setup. The host repo does have contract-oriented coverage for correlation-header override (DefaultHttpProxyServiceTests), forwarded-header behavior (ScriptInvocationRequestTransformerTests), and the HttpUri proxying path (ScriptInvocationContextExtensionsTests).

Implementation direction

  1. Harden src/http/httpProxy.ts to bind one concrete loopback address, use it consistently for the listener and port-probe/rebind path, and fail malformed proxy requests fast.
  2. Preserve the current worker-host contract: keep HttpUri as an internal absolute URI capability and keep x-ms-invocation-id correlation semantics; do not add new repo-only auth signals or public API changes.
  3. Add proxy-focused worker tests for loopback reachability, malformed requests, correlation, response propagation, and port-0 fallback.
  4. Add capability/integration tests around ProgrammingModel and streamed invocation behavior so the hardened worker stays compatible with the current host contract and avoids address-family regressions.

Non-goal / follow-up

  • If SFI requires protection against malicious same-host peers that can reach loopback and somehow learn the correlation material, that needs a coordinated azure-functions-host + library handshake change rather than a library-only patch.

@larohra larohra requested a review from a team as a code owner May 13, 2026 20:52
@larohra
Copy link
Copy Markdown
Author

larohra commented May 13, 2026

@copilot investigate and fix failing Azure pipeline runs.

Copy link
Copy Markdown

Copilot AI commented May 13, 2026

@copilot investigate and fix failing Azure pipeline runs.

I investigated the failing pipeline checks and applied a targeted fix in 61f5201: increased the timeout for the ProgrammingModel streaming-capability test (test/ProgrammingModel.test.ts) to account for slower Windows runners. I also re-ran lint/build/tests locally and all tests are passing (377 passing).

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • azfunc.visualstudio.com
    • Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node /home/REDACTED/work/_temp/ghcca-node/node/bin/node --enable-source-maps /home/REDACTED/work/_temp/copilot-developer-action-main/dist/index.js (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants