dstack-ingress: new optional SAN field on certs #86
Garandor wants to merge 3 commits into Dstack-TEE:main from
For future discussion: we may want to reject traffic that arrives on the per-node domain, so that people cannot circumvent an external load balancer. A follow-up PR may be needed to (optionally) let ...
h4x3rotab
left a comment
LGTM. Thanks for the contribution!
kvinwang
left a comment
Thanks for the clean PR! A couple of notes:
- DNS records for `ALIAS_DOMAIN`: The main `DOMAIN` gets automatic CNAME/TXT record setup, but `ALIAS_DOMAIN` is only added to the certificate and nginx `server_name`; there's no DNS management for it. It would be helpful to document that users need to ensure `ALIAS_DOMAIN` resolves to the same endpoint themselves.
- Conflict with #88: Heads up, #88 replaces nginx with HAProxy and rewrites the entrypoint and nginx config generation that this PR touches. If #88 lands first, this will need to be rebased and adapted to the new HAProxy setup.
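To make the first note concrete, here is a minimal sketch of a check a user could run before deploying, assuming "resolves to the same endpoint" can be approximated by the two hostnames sharing at least one resolved IP address. The function names are hypothetical and not part of dstack-ingress; a setup behind a CDN or gateway may legitimately return different addresses per lookup, so treat a negative result as a hint rather than proof.

```python
import socket


def resolved_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the set of IP addresses that `hostname` resolves to."""
    infos = resolver(hostname, 443, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def alias_matches_primary(domain, alias_domain, resolver=socket.getaddrinfo):
    """True if the alias shares at least one resolved address with the primary domain."""
    primary = resolved_addresses(domain, resolver)
    alias = resolved_addresses(alias_domain, resolver)
    return bool(primary & alias)
```

The `resolver` parameter exists only so the check can be exercised without network access; in normal use, `alias_matches_primary("primary.example.com", "alias.example.com")` performs real lookups.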
- `PROXY_BUFFERS`: Optional value for nginx `proxy_buffers` (format: `number size`, e.g. `4 256k`) in single-domain mode
- `PROXY_BUSY_BUFFERS_SIZE`: Optional value for nginx `proxy_busy_buffers_size` (numeric with optional `k|m` suffix, e.g. `256k`) in single-domain mode
- `CERTBOT_STAGING`: Optional; set this value to the string `true` to set the `--staging` server option on the [`certbot` cli](https://eff-certbot.readthedocs.io/en/stable/using.html#certbot-command-line-options)
- `ALIAS_DOMAIN`: An additional domain to include as a Subject Alternative Name (SAN) on the TLS certificate and in nginx `server_name`. When set, the node's certificate covers both `DOMAIN` and `ALIAS_DOMAIN`, and nginx will accept requests for either hostname.
Worth mentioning that `ALIAS_DOMAIN` only affects the certificate and nginx `server_name`; users are responsible for setting up DNS records (e.g. CNAME) for the alias domain to point to the same endpoint. Without this, the alias domain won't actually be reachable.
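For readers unfamiliar with how a SAN ends up on the certificate: certbot issues a single certificate covering several names when `-d` is passed more than once. The sketch below shows how the argument list might be assembled from `DOMAIN`, `ALIAS_DOMAIN`, and `CERTBOT_STAGING`. `build_certbot_args` is a hypothetical helper, not the code in this PR, and it deliberately omits the DNS authenticator plugin options that a real invocation would need.

```python
def build_certbot_args(domain, alias_domain=None, staging=False):
    """Assemble a certbot argument list for one certificate covering
    `domain` and, if given, `alias_domain` as an extra SAN.

    Hypothetical helper: authenticator/plugin flags are intentionally left out.
    """
    args = ["certbot", "certonly", "--non-interactive", "-d", domain]
    if alias_domain:
        # A second -d adds another name to the same certificate.
        args += ["-d", alias_domain]
    if staging:
        args.append("--staging")
    return args
```

Returning a list rather than a shell string keeps the arguments safe to pass to `subprocess.run` without quoting concerns.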
kvinwang
left a comment
A few more things after reading through the full certificate management flow:
- DNS-01 validation requires same DNS provider: certbot needs to create `_acme-challenge` TXT records for both `DOMAIN` and `ALIAS_DOMAIN`. If the alias domain is managed by a different DNS provider or account, the DNS-01 challenge will fail silently. The PR description says "DNS-provider agnostic", but this isn't quite accurate: both domains must be manageable by the same provider credentials. This should be documented as a prerequisite.
- Multi-domain mode: `ALIAS_DOMAIN` is silently ignored when using the `DOMAINS` env var, since each domain goes through `process_domain` → `renew-certificate.sh` independently and only `certman.py`'s `certonly` path reads `ALIAS_DOMAIN`. Worth thinking about how this should interact with multi-domain mode: perhaps `ALIAS_DOMAIN` should be rejected (with an error) when `DOMAINS` is set, or generalized into a per-domain alias mapping.
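A rough sketch of both points, assuming the "reject with an error" option is chosen. The names `acme_challenge_names`, `ConfigError`, and `validate_alias_config` are hypothetical and do not exist in the repository; the first function only lists the TXT record names that the provider credentials would have to be able to write.

```python
def acme_challenge_names(domain, alias_domain=None):
    """List the DNS-01 TXT record names needed for one certificate."""
    names = [domain] + ([alias_domain] if alias_domain else [])
    # A wildcard name is validated at the base domain's _acme-challenge record.
    return ["_acme-challenge." + (n[2:] if n.startswith("*.") else n) for n in names]


class ConfigError(ValueError):
    """Raised when the environment combines incompatible settings."""


def validate_alias_config(env):
    """Return the alias domain (or None), refusing ALIAS_DOMAIN in multi-domain mode."""
    domains = env.get("DOMAINS", "").strip()
    alias = env.get("ALIAS_DOMAIN", "").strip()
    if domains and alias:
        raise ConfigError(
            "ALIAS_DOMAIN is not supported together with DOMAINS; "
            "list the alias in DOMAINS instead"
        )
    return alias or None
```

Failing fast at startup, for example via `validate_alias_config(os.environ)`, would surface the conflict instead of silently dropping the alias.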
kvinwang
left a comment
One more thought: could this use case be served by the existing multi-domain mode instead of introducing a new `ALIAS_DOMAIN` mechanism?
The idea: add both domains to `DOMAINS` and point them to the same backend. Each domain gets its own certificate, automatic DNS record management (CNAME/TXT), and full evidence chain coverage, which `ALIAS_DOMAIN` currently lacks.
Current approach (nginx, on main branch):
Add both domains to `DOMAINS` and provide per-domain nginx configs via docker configs, both `proxy_pass`-ing to the same backend:

    services:
      ingress:
        image: dstacktee/dstack-ingress:...
        environment:
          DOMAINS: |
            primary.example.com
            alias.example.com
        configs:
          - source: primary_conf
            target: /etc/nginx/conf.d/primary.conf
          - source: alias_conf
            target: /etc/nginx/conf.d/alias.conf
      app:
        image: your-app
    configs:
      primary_conf:
        content: |
          server {
              listen 443 ssl;
              server_name primary.example.com;
              ssl_certificate /etc/letsencrypt/live/primary.example.com/fullchain.pem;
              ssl_certificate_key /etc/letsencrypt/live/primary.example.com/privkey.pem;
              location / { proxy_pass http://app:80; }
          }
      alias_conf:
        content: |
          server {
              listen 443 ssl;
              server_name alias.example.com;
              ssl_certificate /etc/letsencrypt/live/alias.example.com/fullchain.pem;
              ssl_certificate_key /etc/letsencrypt/live/alias.example.com/privkey.pem;
              location / { proxy_pass http://app:80; }
          }

With #88 (HAProxy L4 proxy):
Much simpler: just use `ROUTING_MAP` to point both domains to the same backend:

    services:
      ingress:
        image: dstacktee/dstack-ingress:...
        environment:
          DOMAINS: |
            primary.example.com
            alias.example.com
          ROUTING_MAP: |
            primary.example.com=app:80
            alias.example.com=app:80
      app:
        image: your-app

The multi-domain approach has advantages over `ALIAS_DOMAIN`:
- DNS records (CNAME, TXT) are automatically managed for both domains
- Both domains' certificates are included in the evidence/attestation chain
- No new code needed
The only trade-off is two separate LE certs instead of one SAN cert, but that has no practical impact since LE certs are free and auto-renewed.
What do you think? Is there a specific reason a single SAN certificate is needed here?
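As an illustration of the `ROUTING_MAP` variant, the following sketch parses the `domain=host:port` lines shown in the example and checks that the primary and alias domains land on the same backend. It assumes only the line format visible above; the actual parser in #88 may accept more (comments, other separators), and both function names are hypothetical.

```python
def parse_routing_map(text):
    """Parse `domain=backend` lines into a dict, skipping blank lines."""
    routes = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        domain, sep, backend = line.partition("=")
        if not sep or not domain.strip() or not backend.strip():
            raise ValueError(f"malformed ROUTING_MAP line: {raw!r}")
        routes[domain.strip()] = backend.strip()
    return routes


def share_backend(routes, *domains):
    """True if every listed domain maps to the same backend (KeyError if one is missing)."""
    return len({routes[d] for d in domains}) == 1
```

With the example above, `share_backend(routes, "primary.example.com", "alias.example.com")` confirms that the alias is served by the same application as the primary domain.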
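Adds `ALIAS_DOMAIN` environment variable support to dstack-ingress. When set, the alias is added as a Subject Alternative Name (SAN) on the TLS certificate and to the nginx `server_name`.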
This change is DNS-provider agnostic.
This PR was scoped down from #83.
Thanks to the original author, @wwwehr!