Skip to content

Fix #91: embed upstream response in splash HTML via internal reverse proxy#92

Open
ClaydeCode wants to merge 1 commit into
mainfrom
clayde/issue-91
Open

Fix #91: embed upstream response in splash HTML via internal reverse proxy#92
ClaydeCode wants to merge 1 commit into
mainfrom
clayde/issue-91

Conversation

@ClaydeCode
Copy link
Copy Markdown
Contributor

Summary

Closes #91.

When an app returns a 4xx/5xx error, the splash screen previously swallowed the original response body entirely. This PR implements the desired behavior: the splash looks identical to end users, but the raw HTML contains the original status code and response body in a hidden <script type="application/json" id="upstream-response"> element readable via browser dev tools.

Approach

Traefik's errors middleware discards the original response body before calling the error handler — there is no way to forward it without a custom plugin. The fix is therefore architectural: insert shard_core as a reverse proxy in the path between Traefik and app containers.

How it works

  1. A new Traefik middleware app-proxy-prefix (addPrefix: /internal/app_proxy) is added.
  2. App routers change from routing directly to each app container to routing to shard_core with the new middleware (and no more app-error middleware).
  3. The new /internal/app_proxy/{path:path} endpoint in shard_core:
    • Resolves the target container from app metadata
    • Proxies the request via httpx
    • For 4xx/5xx: captures the body, renders the splash with the upstream data embedded (hidden)
    • For 2xx/3xx: passes the response through unchanged
    • On connection error/timeout: shows the splash without upstream data (container starting up)

Security

</script> sequences in the embedded body are escaped as <\/script> to prevent injection through the JSON data island.

Changes

File Change
data/splash.html Conditionally renders <script type="application/json" id="upstream-response">
shard_core/web/internal/app_error.py Adds upstream_json field to SplashBehaviour, build_upstream_json() helper, make_splash_behaviour_for_proxy(), render_splash_response()
shard_core/web/internal/app_proxy.py New — reverse proxy endpoint
shard_core/web/internal/__init__.py Register app_proxy router
shard_core/service/traefik_dynamic_config.py Add app-proxy-prefix middleware; change app routers to use shard_core service
tests/test_app_error.py Assert no upstream element when called via legacy Traefik path
tests/test_app_proxy.py New — unit + integration tests for proxy
tests/test_traefik_dyn_spec.py Update expected middleware/router assertions

Recommended reading order

  1. data/splash.html — template change (one line)
  2. shard_core/service/traefik_dynamic_config.py — Traefik config change (routing strategy)
  3. shard_core/web/internal/app_error.py — shared helpers
  4. shard_core/web/internal/app_proxy.py — the proxy itself
  5. tests/test_app_proxy.py — tests

🤖 Generated with Claude Code

…proxy

Add an internal reverse proxy (app_proxy) that sits between Traefik and
app containers. Instead of routing app traffic directly to containers
(where Traefik's error middleware discards the response body), Traefik
now routes to shard_core's /internal/app_proxy/{path} endpoint.

The proxy:
- Forwards requests to the actual app container using httpx
- For 4xx/5xx responses: captures the full response body and renders the
  splash HTML with a hidden <script type="application/json"
  id="upstream-response"> element containing the original status code,
  content-type, and body
- For 2xx/3xx responses: passes through unchanged
- For connection errors: shows the splash without upstream data

Traefik changes:
- Add app-proxy-prefix middleware (addPrefix: /internal/app_proxy)
- App routers now use service=shard_core + middlewares=[auth, app-proxy-prefix]
  instead of the old app-specific service + [app-error, auth] setup

The app-error endpoint is kept as a legacy fallback but is no longer used
for normal app traffic. The splash template now conditionally renders the
upstream-response script tag when upstream_json is provided.

Security: </script> sequences in the body are escaped as <\/script> to
prevent injection through the JSON data island.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@max-tet max-tet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems way too complex. Traefik already sends error request to shard_core, that is how the splash works. No need for a separate endpoint. Just extend the endpoint that returns the splash screen with the upstream json. No more.

@ClaydeCode
Copy link
Copy Markdown
Contributor Author

Thanks for the review — agreed, the full reverse-proxy in this PR is too heavy. Before I rework it I want to flag a blocker I hit, because I don't think extending app_error alone can work, and I'd rather confirm the direction than ship something that misses the issue's goal.

The problem: Traefik's errors middleware never gives /internal/app_error/{status} the app's response body. When an app returns 4xx/5xx, Traefik discards the upstream response body and sends a fresh copy of the original request (headers + path only) to the error service. So app_error can see the status (it's in the path) but never the {"detail":[...]} we're trying to surface. Traefik's own errors docs confirm only the original request context is forwarded — there's no option to pass the upstream body.

On main, app routers point straight at the app container; shard_core is only pulled in after the error, on a brand-new request. It's simply never in the response path, so there's nothing for it to read.

Having app_error re-issue the request to fetch the body doesn't rescue it either: the motivating case is POST /api/shards/{id}/core_update, Traefik doesn't forward the original request body, and replaying a mutating POST would be unsafe.

So something has to sit inline on the response path to capture the body — that's the only place it exists. The PR did that with a proxy; I just overbuilt it.

Proposed lean version (drops most of what made this PR heavy):

  • Keep shard_core inline, but no new /app_proxy endpoint surface beyond what's needed — stream 2xx/3xx straight through untouched, only buffer + wrap on 4xx/5xx.
  • Drop the header-filtering machinery and extra make_splash_behaviour_for_proxy duplication; reuse the existing splash path.

Is that acceptable, or do you know a Traefik mechanism to hand the upstream body to the error handler that I've missed? I'll wire up whichever you prefer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Splash screen: embed the real upstream response (hidden) so it's visible in dev tools

2 participants