Can Traefik cause downtime during a dynamic configuration reload?


We have been using Traefik for around 2 weeks now, after switching from an Apache proxy setup to a Docker Swarm based setup. The project is rather large, with around 400 domains including Let's Encrypt certificates and 50 separate YAML dynamic configuration files containing 190 routes and 80 middlewares.

During the switch, the configuration was modified quite often: new domains moved to the new infrastructure, typos in the config were fixed, or features like a rate limiter were added. There could be around 20 changes per day or more, mostly because of new domains, which also meant creating new certificates.

What happened

Most of the changes went smoothly without problems. But sometimes we got response codes like 502 Bad Gateway, or an exotic one like 406 Not Acceptable, which I think came from the application because of a missing Accept header, though I am not sure why. These responses lasted from a few seconds to 2 minutes and then suddenly stopped without any further changes.


To rule out Traefik and learn something:

How does Traefik update its configuration? Can there be downtime while it reconfigures middlewares, routes and containers/services? Or is this simply not possible because Traefik loads the whole config, creates its internal structures and then does a somewhat atomic switch of everything?

Many thanks for your time and help


Traefik prepares the new configuration before applying it. The switch of the routing (routers, services, middlewares) is done in memory and is instantaneous.

Before it can apply the configuration, though, Traefik needs to "prepare" it, so changes do not take effect instantaneously.

So the dynamic configuration reload itself causes no downtime.
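
To illustrate the prepare-then-switch idea described above, here is a minimal Python sketch. It is not Traefik's actual code (Traefik is written in Go). The `Router` and `Switcher` classes and the host-to-backend dict are hypothetical names made up for illustration. The only point is that the new table is fully built off to the side, and requests keep using the old one until a single reference assignment replaces it.

```python
import threading


class Router:
    """Read-only routing table: host -> backend name (hypothetical structure)."""

    def __init__(self, routes):
        # Validate while "preparing"; a bad config must never become active.
        for host, backend in routes.items():
            if not backend:
                raise ValueError(f"route for {host!r} has no backend")
        # Copy so later changes to the caller's dict cannot affect this table.
        self._routes = dict(routes)

    def lookup(self, host):
        return self._routes.get(host)


class Switcher:
    """Holds the active Router and replaces it with one reference assignment."""

    def __init__(self, router):
        self._active = router
        self._reload_lock = threading.Lock()  # serializes reloads, not lookups

    def handle(self, host):
        # Take one snapshot of the reference; the whole request uses that table.
        router = self._active
        return router.lookup(host)

    def reload(self, new_routes):
        with self._reload_lock:
            # "Prepare" phase: this takes time, but the old table keeps serving.
            # If it raises, self._active is left untouched.
            prepared = Router(new_routes)
            # "Switch" phase: a single in-memory reference assignment.
            self._active = prepared


switcher = Switcher(Router({"a.example.com": "svc-a"}))
switcher.reload({"a.example.com": "svc-a2", "b.example.com": "svc-b"})
print(switcher.handle("b.example.com"))
```

In this sketch a reload that fails validation leaves the previous table active, and a request never sees a half-built table. It says nothing about what happens behind the proxy, for example a backend container that is not ready yet.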
