Greetings! I'm about to deploy 6 Traefik v2 edge routers with end-to-end TLS. This will soon grow to at least 24, probably 32. The obvious problem is certificate handling at scale: Let's Encrypt caps me at 50 certificates per registered domain per week, and once I hit that limit I can't get more until the window rolls over.
This brings me to two questions:
Understanding the behavior of Traefik/Let's Encrypt: If I start a new Traefik container and it picks up the cert generated by the previous deployment of the container, it keeps using the original cert as long as it hasn't expired, right? In other words, if I mess up and deploy many times, will my cert count keep climbing, or will Traefik see that the certs in acme.json aren't expired and reuse them?
It seems the better pattern is to issue one certificate for a whole class of servers, not one certificate for each instance of a server. This would mean pulling Let's Encrypt out of Traefik, and issuing my cert when the infrastructure is deployed instead of when Traefik is launched. What are other people doing, and what strategies have worked best?
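To put rough numbers on the two questions above, here is a minimal Python sketch of the certificate budget. It assumes the worst case, where every instance requests a fresh certificate on every rollout, and compares that with issuing a single shared certificate per rollout. The function name `rollouts_within_cap` and the constant `CERT_CAP` are made up for illustration; the cap of 50 is the figure quoted above.

```python
# Hypothetical helper for illustration; not part of Traefik or any ACME client.
CERT_CAP = 50  # certificates allowed per rate-limit window (figure from this post)


def rollouts_within_cap(certs_per_rollout: int, cap: int = CERT_CAP) -> int:
    """Return how many complete rollouts fit in one rate-limit window."""
    if certs_per_rollout <= 0:
        raise ValueError("certs_per_rollout must be positive")
    return cap // certs_per_rollout


if __name__ == "__main__":
    for instances in (6, 24, 32):
        per_instance = rollouts_within_cap(instances)  # one cert per instance
        shared = rollouts_within_cap(1)                # one cert for the whole class
        print(f"{instances} instances: {per_instance} rollouts per-instance, {shared} shared")
```

Under that worst-case assumption, 32 instances leave room for only one full rollout per window, while a shared cert leaves room for 50. Whether the worst case actually applies depends on the answer to the first question.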
I can foresee 2 solutions using a single cert for all Traefik instances:
- Generate the cert when the infra is provisioned by Terraform, then copy it out to each Traefik instance. Reading a local file should be fast at runtime, at the expense of managing lots of file copies.
- Generate the cert when the infra is provisioned by Terraform, then copy it to a network location that each Traefik instance can mount remotely. This reduces the file management/file version problem, potentially at the expense of performance due to network latency.
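The file version problem in the first option could be kept in check with a small consistency test. The following is only a sketch in Python, assuming the cert is a plain file and that every copy is reachable as a local path from wherever the check runs; `fingerprint` and `find_stale_copies` are hypothetical names, not part of Traefik or Terraform.

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_stale_copies(reference: Path, copies: list[Path]) -> list[Path]:
    """Return the copies that are missing or differ from the reference file."""
    expected = fingerprint(reference)
    return [
        copy
        for copy in copies
        if not copy.exists() or fingerprint(copy) != expected
    ]
```

Running `find_stale_copies` after each provisioning step would flag any instance still holding an old or missing cert, which is the main bookkeeping cost of copying files around.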
I think option #2 is better, unless Traefik takes a performance hit from network latency. Does Traefik cache the cert after loading it, so that I can keep the cert on a network mount without regretting it?
What has worked well for others in this situation?