feat(portal): optionally terminate tls in prod by jamilbk · Pull Request #12689 · firezone/firezone
Summary
We're migrating from Azure L7 load balancers to L4 (TCP passthrough) load balancers. L4 load balancers don't terminate TLS, so the Phoenix application needs to handle TLS termination itself.
This PR introduces Portal.CertCache, a GenServer that caches parsed TLS certificates in memory, and wires it into Bandit's sni_fun callback so certificates are served on every TLS handshake without touching disk.
How it works
- Portal.CertCache is a GenServer that accepts a fetch_fn (a 0-arity function returning PEM data), parses the PEM into DER format on init, and holds the parsed cert chain + key in process state.
- Two named instances are started, Portal.CertCache.Web and Portal.CertCache.Api, since the web and API endpoints serve different TLDs with separate certificates. Each instance gets a unique supervisor child ID via child_spec/1.
- Bandit's sni_fun callback calls Portal.CertCache.get_opts/1 on each TLS handshake to retrieve the cached DER certs. Each endpoint points its sni_fun at its own CertCache instance. The sni_fun is passed through thousand_island_options: [transport_options: [sni_fun: ...]] since Bandit delegates SSL options to Thousand Island's transport layer.
- Enablement is driven by port env vars: setting PHOENIX_HTTPS_WEB_PORT enables HTTPS on the web endpoint, and PHOENIX_HTTPS_API_PORT does the same on the API endpoint. When unset, endpoints continue to serve HTTP only. Both HTTP and HTTPS can run simultaneously (useful for health checks).
- CertCache.refresh/1 allows updating certs at runtime without restart. On refresh failure, the stale cert is retained and a warning is logged. On init failure, the GenServer crashes (no stale cert to fall back on), which prevents the endpoint from accepting traffic until certs are available.
- PortalOps stays HTTP-only: it's internal and doesn't need TLS.
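The cache behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the module name CertCacheSketch, the fetch_fn contract ({:ok, {cert_pem, key_pem}} or {:error, reason}), the option names, and the use of GenServer.call for reads are all hypothetical. Only :public_key.pem_decode/1 and the cert/key SSL option shapes come from OTP.

```elixir
defmodule CertCacheSketch do
  @moduledoc "Illustrative sketch only; not the PR's Portal.CertCache."
  use GenServer
  require Logger

  # A unique child id per instance lets one supervisor run several caches.
  def child_spec(opts) do
    %{id: Keyword.fetch!(opts, :name), start: {__MODULE__, :start_link, [opts]}}
  end

  def start_link(opts) do
    name = Keyword.fetch!(opts, :name)
    fetch_fn = Keyword.fetch!(opts, :fetch_fn)
    GenServer.start_link(__MODULE__, fetch_fn, name: name)
  end

  # Returns the cached SSL options: [cert: [der, ...], key: {type, der}].
  def get_opts(server), do: GenServer.call(server, :get_opts)

  # Re-runs fetch_fn; returns :ok, or {:error, reason} with the old cert kept.
  def refresh(server), do: GenServer.call(server, :refresh)

  # Assumed fetch_fn contract: {:ok, {cert_pem, key_pem}} | {:error, reason}.
  def load(fetch_fn) do
    with {:ok, {cert_pem, key_pem}} <- fetch_fn.() do
      parse(cert_pem, key_pem)
    end
  end

  # Decodes PEM text into the DER binaries that :ssl accepts directly.
  def parse(cert_pem, key_pem) do
    certs = for {:Certificate, der, :not_encrypted} <- :public_key.pem_decode(cert_pem), do: der

    keys =
      for {type, der, :not_encrypted} <- :public_key.pem_decode(key_pem),
          type in [:RSAPrivateKey, :ECPrivateKey, :PrivateKeyInfo],
          do: {type, der}

    case {certs, keys} do
      {[_ | _], [key | _]} -> {:ok, [cert: certs, key: key]}
      _ -> {:error, :invalid_pem}
    end
  end

  @impl true
  def init(fetch_fn) do
    case load(fetch_fn) do
      {:ok, ssl_opts} -> {:ok, %{fetch_fn: fetch_fn, ssl_opts: ssl_opts}}
      # No stale cert to fall back on, so startup fails.
      {:error, reason} -> {:stop, reason}
    end
  end

  @impl true
  def handle_call(:get_opts, _from, state), do: {:reply, state.ssl_opts, state}

  def handle_call(:refresh, _from, state) do
    case load(state.fetch_fn) do
      {:ok, ssl_opts} ->
        {:reply, :ok, %{state | ssl_opts: ssl_opts}}

      {:error, reason} ->
        Logger.warning("cert refresh failed, keeping stale cert: #{inspect(reason)}")
        {:reply, {:error, reason}, state}
    end
  end
end
```

In this sketch, two instances would be started with different name and fetch_fn options, and each endpoint's sni_fun would call get_opts/1 with its own instance name.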
Dev environment
In dev, both CertCache instances read from the existing self-signed certs at priv/cert/. The PortalWeb endpoint's config was updated from static certfile/keyfile to sni_fun, so dev now exercises the same code path as production.
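As a rough sketch of what the sni_fun-based endpoint config might look like, assuming the OTP app is named :portal and that a helper PortalWeb.Endpoint.sni_fun/1 exists (both are hypothetical names for illustration; port 13443 is taken from the manual test step below):

```elixir
# config/dev.exs (sketch; the app name and helper function are assumptions)
import Config

config :portal, PortalWeb.Endpoint,
  https: [
    port: 13443,
    # No certfile/keyfile here: the cert comes from the sni_fun callback.
    thousand_island_options: [
      transport_options: [
        # :ssl calls this with the SNI hostname (a charlist) and expects a
        # list of SSL options back. The hypothetical helper would return
        # Portal.CertCache.get_opts(Portal.CertCache.Web).
        sni_fun: &PortalWeb.Endpoint.sni_fun/1
      ]
    ]
  ]
```

The API endpoint would follow the same shape with its own port and a helper that reads from Portal.CertCache.Api.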
Future work
- Azure Key Vault integration (next PR): Replace the file-based fetch_fn with an API client that fetches certs from Azure Key Vault using Managed Identity, with periodic refresh.
- Cert rotation: The refresh/1 API is already in place; the Key Vault PR will add a timer to periodically re-fetch and call it.
Test plan
- Portal.CertCacheTest: 7 tests covering PEM parsing, init, init failure, refresh, and refresh failure
- PortalWeb.EndpointTest: starts Bandit HTTPS with sni_fun → CertCache, connects via :ssl, verifies returned cert DER matches
- PortalAPI.EndpointTest: same as above with a separate cert (CN=api-test), verifying per-endpoint cert isolation
- Full test suite passes (2929 tests, 0 failures)
- Dialyzer passes (0 errors)
- Manual: mix phx.server → verify https://localhost:13443 works via sni_fun
- Deploy with PHOENIX_HTTPS_WEB_PORT and PHOENIX_HTTPS_API_PORT set, verify TLS termination
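
The "connect via :ssl and compare the cert DER" check used in the endpoint tests and the manual steps could look something like the sketch below. The module and function names are hypothetical; :ssl.connect/4, :ssl.peercert/1, and :ssl.close/1 are standard OTP calls.

```elixir
defmodule TlsCheckSketch do
  # Connects to host:port, completes a TLS handshake, and returns
  # {:ok, der} with the leaf certificate the server presented,
  # or {:error, reason} if the connection or handshake fails.
  def peer_cert_der(host, port, sni \\ nil) do
    {:ok, _} = Application.ensure_all_started(:ssl)

    opts = [
      # Self-signed dev certs do not chain to a trusted root,
      # so verification is disabled for this check only.
      verify: :verify_none,
      server_name_indication: String.to_charlist(sni || host)
    ]

    with {:ok, socket} <- :ssl.connect(String.to_charlist(host), port, opts, 5_000) do
      result = :ssl.peercert(socket)
      :ssl.close(socket)
      result
    end
  end
end
```

Calling TlsCheckSketch.peer_cert_der("localhost", 13443) against a running dev server and comparing the result with the DER obtained from :public_key.pem_decode/1 on the expected cert file would confirm that the handshake served the cached certificate.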
🤖 Generated with Claude Code