
dstack-ingress: new optional SAN field on certs #86

Open
Garandor wants to merge 3 commits into Dstack-TEE:main from Garandor:san_issuance
Conversation

@Garandor Garandor commented Mar 23, 2026

Adds ALIAS_DOMAIN environment variable support to dstack-ingress. When set:

  • certbot issues a SAN certificate covering both DOMAIN and ALIAS_DOMAIN (via --expand -d)
  • nginx server_name includes ALIAS_DOMAIN so requests arriving via either hostname are accepted
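A minimal sketch of the two effects listed above, assuming the certbot command line and the nginx `server_name` directive are assembled from the environment values. The helper names `build_certbot_args` and `build_server_name` are hypothetical and are not taken from this PR's diff; a real invocation would also need the DNS authenticator plugin and account options, which are omitted here.

```python
def build_certbot_args(domain, alias_domain=None, staging=False):
    """Return a certbot argument list for DOMAIN plus an optional SAN.

    Hypothetical helper: only flags named in this thread (--expand, -d,
    --staging) and the certonly subcommand are used.
    """
    args = ["certbot", "certonly", "--non-interactive"]
    if alias_domain:
        # --expand lets an existing certificate be reissued with extra names.
        args.append("--expand")
    args += ["-d", domain]
    if alias_domain:
        args += ["-d", alias_domain]
    if staging:
        args.append("--staging")
    return args


def build_server_name(domain, alias_domain=None):
    """Return the nginx server_name directive covering both hostnames."""
    names = [domain] + ([alias_domain] if alias_domain else [])
    return "server_name " + " ".join(names) + ";"


if __name__ == "__main__":
    print(" ".join(build_certbot_args("node1.example.com", "app.example.com")))
    print(build_server_name("node1.example.com", "app.example.com"))
```

When `alias_domain` is unset, both helpers fall back to the single-domain form, which matches the "optional" nature of the variable.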

This change is DNS-provider agnostic.

This PR was scoped down from #83
Thanks to the original author, @wwwehr!

Author

Garandor commented Mar 23, 2026

For future discussion: we may want to reject traffic that arrives on the per-node domain, so that clients cannot bypass an external load balancer.

A follow-up PR may be needed to optionally make nginx accept requests only for ALIAS_DOMAIN.
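One way the follow-up could look, sketched under the assumption of a new opt-in switch. The `alias_only` parameter is hypothetical and does not exist in this PR; the certificate would still cover both names, only the hostnames nginx answers for would change.

```python
def accepted_server_names(domain, alias_domain=None, alias_only=False):
    """Return the hostnames nginx should list in server_name.

    alias_only is a hypothetical opt-in flag: when set, the per-node
    DOMAIN is left out so that only ALIAS_DOMAIN is accepted.
    """
    if alias_only:
        if not alias_domain:
            raise ValueError("alias_only requires ALIAS_DOMAIN to be set")
        return [alias_domain]
    return [domain] + ([alias_domain] if alias_domain else [])
```

Requests for hostnames outside this list would then fall through to whatever default server nginx is configured with, so that default would also need to reject them for the restriction to be effective.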

Contributor

@h4x3rotab h4x3rotab left a comment

LGTM. Thanks for the contribution!

Collaborator

@kvinwang kvinwang left a comment

Thanks for the clean PR! A couple of notes:

  1. DNS records for ALIAS_DOMAIN: The main DOMAIN gets automatic CNAME/TXT record setup, but ALIAS_DOMAIN is only added to the certificate and nginx server_name — there's no DNS management for it. It would be helpful to document that users need to ensure ALIAS_DOMAIN resolves to the same endpoint themselves.

  2. Conflict with #88: Heads up — #88 replaces nginx with HAProxy and rewrites the entrypoint and nginx config generation that this PR touches. If #88 lands first, this will need to be rebased and adapted to the new HAProxy setup.

- `PROXY_BUFFERS`: Optional value for nginx `proxy_buffers` (format: `number size`, e.g. `4 256k`) in single-domain mode
- `PROXY_BUSY_BUFFERS_SIZE`: Optional value for nginx `proxy_busy_buffers_size` (numeric with optional `k|m` suffix, e.g. `256k`) in single-domain mode
- `CERTBOT_STAGING`: Optional; set this value to the string `true` to set the `--staging` server option on the [`certbot` cli](https://eff-certbot.readthedocs.io/en/stable/using.html#certbot-command-line-options)
- `ALIAS_DOMAIN`: An additional domain to include as a Subject Alternative Name (SAN) on the TLS certificate and in nginx `server_name`. When set, the node's certificate covers both `DOMAIN` and `ALIAS_DOMAIN`, and nginx will accept requests for either hostname.
Collaborator

Worth mentioning that ALIAS_DOMAIN only affects the certificate and nginx server_name — users are responsible for setting up DNS records (e.g. CNAME) for the alias domain to point to the same endpoint. Without this, the alias domain won't actually be reachable.
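Since DNS for the alias is left to the user, a quick pre-flight check can catch a missing record. The sketch below is an illustration only, not part of dstack-ingress: it treats "same endpoint" as "shares at least one resolved IP address", which is a heuristic. Behind an external load balancer the alias may legitimately resolve to different addresses than the per-node domain.

```python
import socket


def resolve_addresses(hostname, port=443):
    """Return the set of IP addresses that hostname resolves to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def alias_points_to_same_endpoint(domain, alias_domain, resolver=resolve_addresses):
    """True if the two hostnames share at least one resolved address.

    The resolver is injectable so the check can be exercised without network access.
    """
    return bool(resolver(domain) & resolver(alias_domain))
```

With the default resolver, an unresolvable alias raises `socket.gaierror`, which is itself a useful signal that the record has not been created yet.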

Collaborator

@kvinwang kvinwang left a comment

A few more things after reading through the full certificate management flow:

  1. DNS-01 validation requires same DNS provider: certbot needs to create _acme-challenge TXT records for both DOMAIN and ALIAS_DOMAIN. If the alias domain is managed by a different DNS provider or account, the DNS-01 challenge will fail silently. The PR description says "DNS-provider agnostic" but this isn't quite accurate — both domains must be manageable by the same provider credentials. This should be documented as a prerequisite.

  2. Multi-domain mode: ALIAS_DOMAIN is silently ignored when the DOMAINS env var is used, since each domain goes through process_domain → renew-certificate.sh independently and only certman.py's certonly path reads ALIAS_DOMAIN. Worth thinking about how this should interact with multi-domain mode: perhaps ALIAS_DOMAIN should be rejected (with an error) when DOMAINS is set, or generalized into a per-domain alias mapping.
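The two points above could be turned into a start-up check. This is a sketch under assumptions, not code from the PR: `validate_alias_config` and `acme_challenge_names` are hypothetical helpers, and the first one implements only the "reject with an error" option, not the per-domain alias mapping.

```python
def validate_alias_config(env):
    """Return the alias domain, or None, after rejecting unsupported combinations.

    env is any mapping of environment variable names to values, e.g. os.environ.
    """
    alias = env.get("ALIAS_DOMAIN", "").strip()
    domains = [d.strip() for d in env.get("DOMAINS", "").splitlines() if d.strip()]
    if alias and domains:
        raise ValueError(
            "ALIAS_DOMAIN is not supported in multi-domain mode; "
            "add the alias to DOMAINS instead"
        )
    if alias and not env.get("DOMAIN", "").strip():
        raise ValueError("ALIAS_DOMAIN requires DOMAIN to be set")
    return alias or None


def acme_challenge_names(domain, alias_domain=None):
    """List the TXT record names a DNS-01 challenge needs to write.

    Every name must be writable with the same provider credentials.
    """
    names = [domain] + ([alias_domain] if alias_domain else [])
    return ["_acme-challenge." + name for name in names]
```

Failing at start-up makes the unsupported combination visible instead of letting the alias be dropped without notice.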

Collaborator

@kvinwang kvinwang left a comment

One more thought — could this use case be served by the existing multi-domain mode instead of introducing a new ALIAS_DOMAIN mechanism?

The idea: add both domains to DOMAINS and point them to the same backend. Each domain gets its own certificate, automatic DNS record management (CNAME/TXT), and full evidence chain coverage — which ALIAS_DOMAIN currently lacks.

Current approach (nginx, on main branch):

Add both domains to DOMAINS and provide per-domain nginx configs via docker configs, both proxy_pass-ing to the same backend:

services:
  ingress:
    image: dstacktee/dstack-ingress:...
    environment:
      DOMAINS: |
        primary.example.com
        alias.example.com
    configs:
      - source: primary_conf
        target: /etc/nginx/conf.d/primary.conf
      - source: alias_conf
        target: /etc/nginx/conf.d/alias.conf
  app:
    image: your-app

configs:
  primary_conf:
    content: |
      server {
          listen 443 ssl;
          server_name primary.example.com;
          ssl_certificate /etc/letsencrypt/live/primary.example.com/fullchain.pem;
          ssl_certificate_key /etc/letsencrypt/live/primary.example.com/privkey.pem;
          location / { proxy_pass http://app:80; }
      }
  alias_conf:
    content: |
      server {
          listen 443 ssl;
          server_name alias.example.com;
          ssl_certificate /etc/letsencrypt/live/alias.example.com/fullchain.pem;
          ssl_certificate_key /etc/letsencrypt/live/alias.example.com/privkey.pem;
          location / { proxy_pass http://app:80; }
      }
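The two server blocks above differ only in the hostname, so they could be generated from one template rather than written by hand. A small sketch, assuming the same certificate paths and backend as in the example; `render_server_block` is a hypothetical helper, not something dstack-ingress provides.

```python
SERVER_TEMPLATE = """\
server {{
    listen 443 ssl;
    server_name {domain};
    ssl_certificate /etc/letsencrypt/live/{domain}/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/{domain}/privkey.pem;
    location / {{ proxy_pass {backend}; }}
}}
"""


def render_server_block(domain, backend="http://app:80"):
    """Render one nginx server block for a domain proxied to the shared backend."""
    return SERVER_TEMPLATE.format(domain=domain, backend=backend)


if __name__ == "__main__":
    for name in ("primary.example.com", "alias.example.com"):
        print(render_server_block(name))
```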

With #88 (HAProxy L4 proxy):

Much simpler — just use ROUTING_MAP to point both domains to the same backend:

services:
  ingress:
    image: dstacktee/dstack-ingress:...
    environment:
      DOMAINS: |
        primary.example.com
        alias.example.com
      ROUTING_MAP: |
        primary.example.com=app:80
        alias.example.com=app:80
  app:
    image: your-app
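To illustrate the `domain=backend` shape used in the example above, here is a sketch of how such a map could be parsed and cross-checked against DOMAINS. The format is inferred only from this comment; the actual parsing in #88 may differ, and both function names are hypothetical.

```python
def parse_lines(value):
    """Split a multi-line environment value into non-empty, stripped entries."""
    return [line.strip() for line in value.splitlines() if line.strip()]


def parse_routing_map(value):
    """Parse 'domain=host:port' lines into a dict of domain -> backend."""
    routes = {}
    for line in parse_lines(value):
        domain, sep, backend = line.partition("=")
        if not sep or not domain.strip() or not backend.strip():
            raise ValueError(f"invalid ROUTING_MAP entry: {line!r}")
        routes[domain.strip()] = backend.strip()
    return routes


def unrouted_domains(domains_value, routing_map_value):
    """Return the DOMAINS entries that have no ROUTING_MAP entry."""
    routes = parse_routing_map(routing_map_value)
    return [d for d in parse_lines(domains_value) if d not in routes]
```

Pointing an alias at the same backend is then just a second map entry with the same right-hand side.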

The multi-domain approach has advantages over ALIAS_DOMAIN:

  • DNS records (CNAME, TXT) are automatically managed for both domains
  • Both domains' certificates are included in the evidence/attestation chain
  • No new code needed

The only trade-off is two separate LE certs instead of one SAN cert, but that has no practical impact since LE certs are free and auto-renewed.

What do you think? Is there a specific reason a single SAN certificate is needed here?


3 participants