How to move to a new server with zero downtime?

I have a traefik:v2.10 docker container with about 5 sites on one machine and I need to move all of that to a new machine.

With Traefik 1.7 I could simply have Ansible spin up the same set of containers on the new host, and when everything was working, switch the DNS.

With 2.10, Traefik starts requesting certs as soon as the containers spin up, while the DNS is still pointing at the old host, so I'm quickly rate limited.

All I can figure at this point is to shut down traefik (disabling all of the other hosts on the new server), spin up the new host containers, switch the DNS, and then start traefik.

How do I move to a new server with zero downtime?

I understand that I can't have multiple traefik instances share the json certs file. I think I saw something about storing the certs on some kind of database, but I can't find that now.

What am I missing?

You can copy the acme.json file to the new Traefik instance, so that it reuses the existing certificates instead of requesting new ones. After that, switch the DNS to the new IP.
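As a rough sketch of that step: once the file has been transferred to the new host (for example with scp or rsync), it can be put in place with restrictive permissions before Traefik starts, since Traefik refuses an acme.json whose permissions are more open than 600. The function name `stage_acme_file` and the paths are hypothetical, not part of Traefik.

```python
import json
import os
import shutil
import stat


def stage_acme_file(src: str, dst: str) -> None:
    """Copy an acme.json file into place and restrict it to mode 0600.

    `src` and `dst` are hypothetical paths; adjust them to wherever the
    Traefik container mounts its ACME storage file.
    """
    # Fail early if the source is not valid JSON (e.g. a truncated transfer).
    with open(src, "r", encoding="utf-8") as fh:
        json.load(fh)

    shutil.copyfile(src, dst)
    # Owner read/write only, which is what Traefik expects for this file.
    os.chmod(dst, stat.S_IRUSR | stat.S_IWUSR)
```

Running this while the new Traefik container is stopped avoids Traefik overwriting the file in the middle of the copy.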


Thanks @bluepuma77, that's very helpful. I guess the race conditions happen only when a cert gets renewed, and that's months away, so it's not an issue.

I suppose that I could also copy over just the one cert into the other server (that's also running other sites)?
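A minimal sketch of what copying a single cert might look like, assuming the usual Traefik v2 acme.json layout: a top-level key per certificate resolver, each holding a `Certificates` list whose entries carry a `domain.main` field. That layout, the resolver name, and the helper `copy_certificate` are assumptions for illustration, not an official Traefik tool, so check them against your own file first.

```python
def copy_certificate(source: dict, target: dict, resolver: str, domain: str) -> bool:
    """Copy the certificate whose main domain is `domain` from source to target.

    Both `source` and `target` are parsed acme.json documents (assumed
    Traefik v2 layout). Returns True if a certificate was copied, False
    if no certificate for `domain` was found under `resolver`.
    """
    src_certs = (source.get(resolver) or {}).get("Certificates") or []
    match = next(
        (c for c in src_certs if c.get("domain", {}).get("main") == domain),
        None,
    )
    if match is None:
        return False

    dst_resolver = target.setdefault(resolver, {})
    dst_certs = dst_resolver.get("Certificates") or []
    # Drop any existing entry for the same domain so the copy replaces it.
    dst_certs = [c for c in dst_certs if c.get("domain", {}).get("main") != domain]
    dst_certs.append(match)
    dst_resolver["Certificates"] = dst_certs
    return True
```

The dictionaries would be loaded with `json.load` and written back with `json.dump` while Traefik on the target server is stopped, and the resolver name has to match the one configured on that server. The target keeps its own ACME account entry; only the certificate list is touched.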

The other solution would be to use DNS validation, but I don't typically control my clients' domains.

I've seen some stuff about CNAMEs that I don't quite understand.

If the client makes discourse.example.com a CNAME to example.mydomain.com, is there a way to have the DNS challenge resolve to my domain, so I can do it that way? I see topics like this one. I think that won't work because I have to control more than just the single hostname, like this says:

Maybe the thing to do is switch to key/value store as described at https://www.traefik.tech/user-guide/kv-config/?