Zero-downtime deployments with Docker Swarm: What's the proper way?

Hi all,

I am looking for more information about the behavior of "Traefik + Docker Swarm".

Goal

  • I would like my myrepo/alpha-app to deploy with Zero Downtime (I am using Gitlab to do this for me)

What "zero-downtime means":

  1. A single version of the application container is running
  2. A new update is published to the repo
  3. Gitlab assembles the Docker image
  4. Gitlab tells Docker Swarm to deploy a new version of the application
  5. Docker Swarm brings up the new container and connects it to the existing Docker Service
  6. Traefik should automatically choose the newest container to send requests to
  7. Docker Swarm then kills the old container

All of the above can obviously happen within milliseconds, but I want to make sure I am doing this right. The last 3 steps are the critical part.

Questions

1. Is this something that is configured in Traefik or in Docker Swarm?

I know Docker Swarm has this deploy configuration for my "docker-compose.yml" file:

  app:
    image: myrepo/alpha-app
    networks:
      - web-public
    deploy:
      replicas: 1
      update_config:
        parallelism: 1
        delay: 5s
        order: start-first

2. Or is this something that I need to use "priority labels" for?

I saw the "priority" label in the Traefik docs. I could probably set the labels to "epoch time", but I am not sure if I even need this.

There is also this example that does a "blue/green" deployment, but I don't know if I need that or not: bluegreen-traefik-docker/docker-stack-appli-blue.yml at master · rodolpheche/bluegreen-traefik-docker · GitHub

Configurations

docker-compose.yml
version: '3.7'
services:
  traefik:
    # Use a pinned Traefik v2 image
    image: traefik:v2.4
    networks:
        - web-public
    ports:
      # Listen on port 80, default for HTTP, necessary to redirect to HTTPS
      - target: 80
        published: 80
        mode: host
      # Listen on port 443, default for HTTPS
      - target: 443
        published: 443
        mode: host
    deploy:
      mode: global
      update_config:
        parallelism: 1
        delay: 5s
        order: start-first
      placement:
        constraints:
          # Run the Traefik service only on manager nodes,
          # as that node holds the volume with the certificates
          - node.role==manager
    volumes:
      # Add Docker as a mounted volume, so that Traefik can read the labels of other services
      - /var/run/docker.sock:/var/run/docker.sock:ro
      # Mount the volume to store the certificates
      - certificates:/certificates
    configs:
      - source: traefik
        target: /etc/traefik/traefik.yml

  app:
    image: myrepo/alpha-app
    networks:
      - web-public
    deploy:
      replicas: 1
      update_config:
        parallelism: 1
        delay: 5s
        order: start-first
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.alpha.rule=Host(`alpha.dev.test`)"
        - "traefik.http.routers.alpha.entrypoints=websecure"
        - "traefik.http.routers.alpha.tls=true"
        - "traefik.http.routers.alpha.tls.certresolver=letsencryptresolver"
        - "traefik.http.services.alpha.loadbalancer.server.port=443"
        - "traefik.http.services.alpha.loadbalancer.server.scheme=https"

volumes:
  # Create a volume to store the certificates, there is a constraint to make sure
  # Traefik is always deployed to the same Docker node with the same volume containing
  # the HTTPS certificates
  certificates:

configs:
  traefik:
    name: "traefik.yml"
    file: ./traefik.yml

networks:
  # Use the previously created public network "web-public", shared with other
  # services that need to be publicly available via this Traefik
  web-public:
    external: true

traefik.yml
# Do not panic if using a self-signed cert
serversTransport:
  insecureSkipVerify: true

### Providers
providers:
  docker:
    network: web-public
    exposedByDefault: false
    swarmMode: true
## Entry points (trust from CloudFlare)
entryPoints:
  web:
    address: ":80"
    http:
      redirections:
        entrypoint:
          to: websecure
          scheme: https
    proxyProtocol:
      trustedIPs:
        - "173.245.48.0/20"
        - "103.21.244.0/22"
        - "103.22.200.0/22"
        - "103.31.4.0/22"
        - "141.101.64.0/18"
        - "108.162.192.0/18"
        - "190.93.240.0/20"
        - "188.114.96.0/20"
        - "197.234.240.0/22"
        - "198.41.128.0/17"
        - "162.158.0.0/15"
        - "104.16.0.0/13"
        - "104.24.0.0/14"
        - "172.64.0.0/13"
        - "131.0.72.0/22"
        - "2400:cb00::/32"
        - "2606:4700::/32"
        - "2803:f800::/32"
        - "2405:b500::/32"
        - "2405:8100::/32"
        - "2a06:98c0::/29"
        - "2c0f:f248::/32"

  websecure:
    address: ":443"
    proxyProtocol:
      trustedIPs:
        - "173.245.48.0/20"
        - "103.21.244.0/22"
        - "103.22.200.0/22"
        - "103.31.4.0/22"
        - "141.101.64.0/18"
        - "108.162.192.0/18"
        - "190.93.240.0/20"
        - "188.114.96.0/20"
        - "197.234.240.0/22"
        - "198.41.128.0/17"
        - "162.158.0.0/15"
        - "104.16.0.0/13"
        - "104.24.0.0/14"
        - "172.64.0.0/13"
        - "131.0.72.0/22"
        - "2400:cb00::/32"
        - "2606:4700::/32"
        - "2803:f800::/32"
        - "2405:b500::/32"
        - "2405:8100::/32"
        - "2a06:98c0::/29"
        - "2c0f:f248::/32"

accessLog: {}
log:
  level: ERROR

api:
  dashboard: true
  insecure: true

certificatesResolvers:
  letsencryptresolver:
    # Enable ACME (Let's Encrypt): automatic SSL.
    acme:

      # Email address used for registration.
      #
      # Required
      #
      email: "me@example.test"

      # File or key used for certificates storage.
      #
      # Required
      #
      storage: "/certificates/acme.json"

      # Use a HTTP-01 ACME challenge.
      #
      # Optional
      #
      httpChallenge:

        # EntryPoint to use for the HTTP-01 challenges.
        #
        # Required
        #
        entryPoint: web

Any insight would be greatly appreciated!! 🙌


After some reading, it looks like it could be healthcheck-related? https://betterprogramming.pub/zero-downtime-deployment-with-docker-swarm-d84d8d9d9a14

The above example is with NGINX though...

Is there anything I would need to set within Traefik? (specifically the "priority" labels)

Probably the biggest problem is that you have 1 replica. Even with start-first, as soon as the replacement is running, the old one will be stopped. And yes, your container healthcheck will have to be returning healthy before Traefik will balance to it.

With replicas > 1 and --update-parallelism set to a sensible number you should have replicas still available to handle requests.
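
For example, building on the app service from the compose file above, a sketch could look like this (the numbers are just examples):

  app:
    image: myrepo/alpha-app
    networks:
      - web-public
    deploy:
      replicas: 2            # keep one task serving while the other is replaced
      update_config:
        parallelism: 1       # replace one task at a time
        delay: 5s            # wait between replacements
        order: start-first   # start the new task before stopping the old one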

Thanks for getting back @cakiwi!

The intention of replicas: 1 was to minimize memory usage for simple apps (where they really don't need to be running multiple replicas).

Looking at the documentation (Compose file version 3 reference | Docker Documentation), it sounds like the services briefly overlap.

Questions

  1. Does this mean replicas: 1 is the "hard cut-off", no matter the "update_config" setting?
  2. If I set replicas: 2, just to confirm, I do not have to do anything with Traefik priority labels? It just does this automatically?

Thanks for your help!

Hello Guys!
I've also been using start-first with Swarm; see the example. Additionally, I've been adding a health check at the service level, as shown in this example. The example code is a little out of date, but you should be able to refer to it and adapt it accordingly.
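
Roughly, a service-level health check in the compose file looks like this (just a sketch; the /health endpoint, port 3000 and the availability of curl in the image are assumptions you would adapt to your app):

  app:
    image: myrepo/alpha-app
    healthcheck:
      test: ["CMD", "curl", "-fsS", "http://localhost:3000/health"]   # hypothetical endpoint and port
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 15s   # give the app time to boot before failures count

Swarm only considers a task ready once this check passes, which is what lets start-first overlap containers safely.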

Cheers,
Jakub

Thanks much @jakubhajek!

I was just doing some testing with a Laravel PHP app. I was able to get zero-downtime deployments with deploy.replicas: 1 and order: start-first.

I only noticed the issue with our Node containers, so I think it might be healthcheck-related. That makes sense, since Node probably takes longer to spin up than PHP.

I will do more testing and will keep you posted!

Hey @jaydrogers

Healthcheck probes are designed to determine whether an app is live and ready to accept incoming requests. Kubernetes has two types, liveness and readiness, which together tell you that the application is healthy and ready to accept requests.

Docker also has health checks, but on the Traefik level you have to check whether your backend is ready. Here is a link to the official Healthcheck documentation.

Cheers, Jakub


I see a problem with the examples given by @jakubhajek.

After a successful update, when Docker Swarm starts to shut down the old container, Traefik will only notice on one of the next health checks. If a request comes in just before that "unhealthy" health check, it may be load-balanced to the shutting-down container, which may no longer be able to process it. That will result in an error for the client.

Or is Traefik notified that the old container is shutting down before it runs its next health check?


Is there any resolution for that? Currently I do not see a proper way to implement a zero-downtime rolling update in Docker Swarm, because Traefik does not know about the services that are shutting down...

I tried to find some appropriate changes in the release notes, but there is nothing in this regard.

A few years later on this, and the only way I've been able to get zero downtime is by setting:

    deploy:
      replicas: 2

Unless something has changed in Traefik since my original post, I am not aware of it working with replicas: 1.

Would love to hear other thoughts if people have them.

I posted this in the "#swarm" Discord channel on "devops.fan", and this is what Martin Braun had to say:

For proper zero downtime you have to tell traefik to use the virtual ip. Look for lbswarm in the settings. Healthchecks are mandatory for this to work properly.

Bret has a good Video on it https://www.youtube.com/live/dLBGoaMz7dQ?feature=share

But with traefik you have to use lbswarm in my experience as that works well with docker healthchecks. Otherwise you have to set up healthchecks somehow in traefik as well.

lbswarm (Traefik configuration)

This was a very helpful direction and I will test this out soon!
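
If I understand the suggestion right, the relevant pieces would look something like this (untested sketch; the /health endpoint and curl being available in the image are assumptions):

  app:
    image: myrepo/alpha-app
    healthcheck:
      test: ["CMD", "curl", "-fsS", "http://localhost/health"]   # hypothetical endpoint
      interval: 5s
      timeout: 3s
      retries: 3
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.alpha.rule=Host(`alpha.dev.test`)"
        # Use the Swarm service VIP instead of individual task IPs, so Swarm
        # (which respects the container healthcheck) does the load balancing
        - "traefik.docker.lbswarm=true"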

The first question is about request duration. For short-lived requests, it is usually not a problem.

We use a NodeJS application that stops accepting new connections when the corresponding signal is sent; ongoing requests are finished. Docker usually waits 10 seconds for a container to exit, then kills it.

Use parameters like --update-order start-first, --update-delay and --update-parallelism to always have another running container.

Be aware that the Traefik Docker Swarm provider has a default poll interval of 15 seconds, so new containers are only picked up then and your updates have to be "slower" accordingly. (Doc)
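
If needed, that interval can be lowered in the static configuration, roughly like this (sketch; 5 seconds is just an example value):

providers:
  docker:
    swarmMode: true
    network: web-public
    exposedByDefault: false
    # How often Traefik polls the Swarm API for service changes (default: 15)
    swarmModeRefreshSeconds: 5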

Check the Docker service update doc; similar parameters are available in the Docker stack compose file.

By the way, I found a solution with no changes in Traefik at all.

It works like this:

  1. In the container I have the usual healthcheck route (/health), and I added a 'special' healthcheck route just for Traefik (/health/lb).

  2. The application has graceful shutdown logic. When a shutdown signal comes, then BEFORE we switch the Express server off (for Node.js apps), I make this special healthcheck start returning 500 codes for the next 11 seconds. ONLY AFTER these 11 seconds do I start the graceful shutdown procedure. So within those 11 seconds the app still functions as normal.

  3. In the service definition I added something like this:

     # Define the healthcheck config
     - "traefik.http.services.myapi123.loadbalancer.healthcheck.path=/health/lb"
     - "traefik.http.services.myapi123.loadbalancer.healthcheck.interval=5s"
     - "traefik.http.services.myapi123.loadbalancer.healthcheck.timeout=2s"
     - "traefik.http.services.myapi123.loadbalancer.healthcheck.scheme=http"
    

So as you can see, Traefik always checks the app's health via this 'special' route every 5 seconds.

  4. I also increased the grace period, because by default it is 10 seconds, but in our case we need more than 11 seconds to exit gracefully.

    stop_grace_period: 20s

So when Docker starts shutting the container down, we first inform Traefik that it is unhealthy. Traefik stops balancing traffic to this container, and then we safely allow the container to die. Of course, the update order should start the new container first, like this:

  update_config:
    failure_action: rollback
    parallelism: 1
    delay: 1s
    order: start-first
  rollback_config:
    parallelism: 0
    order: stop-first
  restart_policy:
    condition: any
    delay: 10s
    max_attempts: 3
    window: 120s

That's it! It does not matter how many replicas we have. I tested this approach with JMeter on one replica by sending thousands of API calls per second while continuously updating the app. This is an absolutely ZERO-downtime approach with only one drawback: the time to shut down is more than 11 seconds. But that is not a problem at all!
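
Putting the pieces together, the relevant parts of the service definition look roughly like this (a sketch; the image name and host rule are placeholders):

  myapi123:
    image: myrepo/myapi123      # hypothetical image
    stop_grace_period: 20s      # the app keeps serving for ~11s after the signal, then shuts down
    deploy:
      replicas: 1
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.myapi123.rule=Host(`api.dev.test`)"   # placeholder host
        - "traefik.http.services.myapi123.loadbalancer.healthcheck.path=/health/lb"
        - "traefik.http.services.myapi123.loadbalancer.healthcheck.interval=5s"
        - "traefik.http.services.myapi123.loadbalancer.healthcheck.timeout=2s"
      update_config:
        failure_action: rollback
        parallelism: 1
        delay: 1s
        order: start-first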


How are you going to use the start-first order for Traefik, which has published ports?

When you have ports, you cannot run 2 Traefik containers on the same machine because they both need the same port (i.e. the 2nd container will stay pending with the message no suitable node ... host-mode port already in use on 1 node). It means order: start-first just does not make sense here.

At least this is the behaviour I get. Could somebody maybe explain what I am doing wrong?

It depends on how you use the ports. When a Docker Swarm service (like Traefik) just declares a port, the Docker ingress network is used: the port is available on all nodes and connections are forwarded by Docker to an available container, probably round robin.

If you open the port in host mode (see simple Traefik Swarm example), then the local container uses the port exclusively and you need a stop-first policy. We use this, with an externally managed load balancer in front of the Traefik nodes, to ensure HA even during a Traefik upgrade.
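
In compose terms, the difference is roughly this (sketch):

    ports:
      # Ingress mode (the default): Docker's routing mesh owns the published port on
      # every node and forwards to an available task, so start-first can overlap containers.
      - target: 443
        published: 443
        mode: ingress
      # Host mode: the container binds the node's port directly, so two containers on the
      # same node cannot hold it at once -> this needs stop-first (or a drained node).
      # - target: 443
      #   published: 443
      #   mode: host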


Thank you for your reply.

The example docker-compose (in the first post in this thread) contains mode: host ports. And based on this example, people here discuss the order: start-first approach. So I am confused now.

The start-first order is for the target services, which don't use an external port.


Will the start-first policy work with mode: ingress ports?

It should work with Traefik and target apps, as in that case the port is not really used by a container, but Docker creates its own ingress network on the port to distribute requests internally.

This should enable zero downtime Traefik deployments. But note that this is not HA, as the node can still fail.
