I am trying to figure out the right way to deploy code updates on a docker stack.
I am running a few services in docker containers behind traefik (v2.6.1), some of which are built from internal code on each deployment.
Currently, I deploy updates with `docker stack deploy -c docker-compose.yml <mystack>` after rebuilding the docker images I need. While this works, the services are unavailable for more than 10 minutes each time, even excluding image build time. Since I'm deploying several times a day, this adds up to a lot of downtime. In the traefik logs, I'm seeing a lot of output about traefik starting up, checking letsencrypt certs, etc. In the meantime, I'm getting `gateway timeout` errors while trying to access my services from the web.
What is the proper way to deploy updates to my services behind traefik? I feel like there should be a way to refresh service containers without having to touch the `traefik` container, but I've not been able to find anything in the docs or this forum.
To add a bit of context, the stack is currently running on a single-node docker swarm, and apart from the downtime during deployments, the setup is fully working.
Thanks for any tips!
Can you please share your docker-compose.yaml file that is being used?
Good practice is to use more than one stack, e.g. a first stack containing only the proxy and a second one where your application is configured.
See one of my old examples where I created more than one stack:
The example is out of date but you can see the general idea with more stacks.
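To make the two-stack idea concrete, here is a minimal sketch of what a proxy-only stack might look like for Traefik v2.x on Swarm. It is not taken from the example above: the file name, the overlay network name `traefik-public`, the resolver name `le`, the email address, and the volume name are all assumptions chosen for illustration.

```yaml
# traefik-stack.yml (hypothetical file name)
version: "3.8"

services:
  traefik:
    image: traefik:v2.6.1
    command:
      # Swarm mode: Traefik reads labels from the service's deploy.labels
      - --providers.docker=true
      - --providers.docker.swarmMode=true
      - --providers.docker.exposedByDefault=false
      # Default network Traefik uses to reach the app services (assumed name)
      - --providers.docker.network=traefik-public
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      # ACME settings are placeholders; adjust to your own resolver setup
      - --certificatesresolvers.le.acme.email=admin@example.com
      - --certificatesresolvers.le.acme.storage=/letsencrypt/acme.json
      - --certificatesresolvers.le.acme.tlschallenge=true
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      # Persist certificates so a Traefik restart does not request them again
      - letsencrypt:/letsencrypt
    networks:
      - traefik-public
    deploy:
      placement:
        constraints:
          # The Swarm API is only available on manager nodes
          - node.role == manager

volumes:
  letsencrypt:

networks:
  traefik-public:
    # Created once, outside of any stack
    external: true
```

With this layout the shared network would be created once with `docker network create --driver overlay traefik-public`, and the proxy deployed with something like `docker stack deploy -c traefik-stack.yml proxy`. The application stack then attaches to the same external network, so redeploying the application never restarts Traefik.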
Hello @jakubhajek ,
Thanks a lot for pointing out your example config. I've been able to find inspiration from it and (I think) fix my problem.
A few things I had wrong/learned in the process:
- I was mixing plain docker and docker swarm: specifically, I was using `docker stack deploy` with the docker provider, not realising that I had to tell traefik that I was using swarm (see this if you come here in the future looking for a similar answer). And my labels were set at the container level, not under the `deploy` key, so that couldn't work.
- I was missing the `traefik.docker.network` label on my services, which meant that traefik was randomly routing through the wrong network and giving me timeouts (see Docker - Traefik).
- I had no proper `deploy` config, and no healthchecks.
- I stopped using `latest` tags on my docker images and use the git SHA hash instead, which solves a lot of problems like image updates not being pulled, etc.
- I've also split my config into 2 stacks as you suggested, and only run `docker stack deploy` on my app stack on each deployment. I run the command on the traefik stack only if the config/version changes there.
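
For anyone landing here later, the points above could be sketched roughly as the following app stack. This is an illustration rather than my exact config: the service name `myapp`, the registry, the hostname, port 8000, the `/health` endpoint, and the `GIT_SHA` variable are hypothetical, and the entrypoint `websecure` and resolver `le` are assumed to be defined in the traefik static config.

```yaml
# app-stack.yml (hypothetical file name)
version: "3.8"

services:
  myapp:
    # Immutable tag instead of "latest"; GIT_SHA is assumed to be set in the shell
    image: registry.example.com/myapp:${GIT_SHA}
    networks:
      - traefik-public
    healthcheck:
      # Assumes the image ships curl and serves /health on port 8000
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 15s
    deploy:
      replicas: 2
      update_config:
        parallelism: 1
        # Start the new task before stopping the old one
        order: start-first
        failure_action: rollback
      # In swarm mode, traefik labels go under deploy, not on the container
      labels:
        - "traefik.enable=true"
        # Tell traefik which network to use when the service has several
        - "traefik.docker.network=traefik-public"
        - "traefik.http.routers.myapp.rule=Host(`myapp.example.com`)"
        - "traefik.http.routers.myapp.entrypoints=websecure"
        - "traefik.http.routers.myapp.tls.certresolver=le"
        # The port must be declared explicitly in swarm mode
        - "traefik.http.services.myapp.loadbalancer.server.port=8000"

networks:
  traefik-public:
    external: true
```

A deployment would then look something like `GIT_SHA=$(git rev-parse --short HEAD) docker stack deploy -c app-stack.yml myapp`, which only touches the app services. With a healthcheck in place, swarm waits for the new task to report healthy before routing to it, and `order: start-first` keeps the old task running until then.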
Overall, your example config was super helpful. A couple of suggestions to make it even more useful:
- Add a README giving an overview of how it works, why 2 stacks, etc...
- Comment each line/label, at least with a link to the relevant doc section
Glad to hear that you found the root cause of the issue. I encourage you to have a look at our YouTube channel, where there are a lot of great videos produced by our community members. You can learn and get inspiration from how Traefik is used and what problems it solves.
Regarding the example repo I shared, it seems that I explain step by step in that recording how everything works. Again, it is quite old, but I hope you find some value there.