Traefik stops routing after some minutes (Docker Swarm)

Traefik stops routing after some minutes on Docker Swarm. I suspect it happens when a service (any service) gets redeployed.

I'm running 3 nodes in Docker Swarm. When I restart each server, the problem goes away until I redeploy a service, or until enough time passes that one gets redeployed automatically.
I'm running Ubuntu 22.04 on two servers and 22.10 on one.

It seems that no requests reach Traefik: its logs contain no entries related to routing.

This is really strange to me, because I can still access the Traefik dashboard, and every route is listed there.

Here's what I've done over the past 4 days:

  1. Reviewed the entrypoints setup
  2. Added the lbswarm=true label to every container (this didn't fix the issue)
  3. Edited /etc/resolv.conf on each node to make sure it points to my PiHole instance and not the router
  4. Re-read the documentation twice ( :exploding_head: )
  5. Made sure that ports: 80 in the docker-compose.yml stack is published only by the Traefik container
  6. Examined the Traefik logs (I know there's one entry about a missing middleware, but I don't think it's related, since the container stops routing about 30 minutes after start-up)
  7. Copied my rootCA.pem certificate to /etc/ssl/certs on each node
  8. Set the default network for the swarm provider (swarm: network: web) in traefik-conf.yml
  9. Switched away from the :latest image

I'm exhausted… ;_;

dashy-stack.yml:

version: "3.5"

services:
  dashy:
    image: lissy93/dashy:latest
    container_name: dashy
    networks:
      - web
      - dashy
    volumes:
      - /home/swarm/dashy/dashy-conf.yml:/app/user-data/conf.yml
      - /etc/ssl/certs:/etc/ssl/certs
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=Europe/Warsaw
    deploy:
      replicas: 3
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.dashy.rule=Host(`dashy.home`)"
        - "traefik.http.services.dashy.loadbalancer.server.port=8080"
        - "traefik.docker.network=web"
        - "io.portainer.accesscontrol.users=admin"
        - "traefik.http.routers.dashy.tls=true"
        - "traefik.http.routers.dashy.entrypoints=websecure"
        - "traefik.docker.lbswarm=true" # Disables load-balancing in Traefik and delegates it to Docker Swarm. This still doesn't fix the problem


networks:
  dashy:
    driver: overlay
    attachable: true
    name: dashy
  web:
    external: true
    name: web

traefik-stack.yml:

services:
  reverse-proxy:
    image: traefik:3.0
    ports:
      - "80:80"
      - "8080:8080" # For Traefik dashboard
      - "443:443"
    #  - "222:222"
    volumes:
      - "/home/swarm/traefik/traefik-conf.yml:/etc/traefik/traefik.yml"
      - /var/run/docker.sock:/var/run/docker.sock
      - "/home/swarm/traefik/configuration/:/configuration/"
      - "/home/swarm/traefik/certs/:/certs/"
    environment:
      - TZ=Europe/Warsaw
    networks:
      - web

    deploy:
      labels:
      - "traefik.enable=true"
      - "traefik.http.routers.api.rule=Host(`traefik.home`)"
      - "traefik.http.routers.api.service=api@internal"
      - "traefik.http.routers.api.middlewares=auth"
      - "traefik.http.middlewares.auth.basicauth.users=admin:$$apr1$$iB4qdp8O$$I2l4qaWhHUCWGyl7lkSEa/"
      # Dummy service for Swarm port detection. The port can be any valid integer value.
      - "traefik.http.services.dummy-svc.loadbalancer.server.port=9999"
      mode: global
      placement:
        constraints: [node.role == manager]

networks:
  web:
    driver: overlay
    attachable: true
    name: web

traefik-conf.yml:

### Static Configuration
http:
  middlewares:
    my-sablier: # For Sablier to Work
      plugin:
        sablier:
          group: non-essential
          dynamic:
            displayName: The service starts...
            refreshFrequency: 5s
            showDetails: "true"
            theme: ghost
          sablierUrl: https://sablier:10000
          sessionDuration: 1m

log:
  level: DEBUG
api:
  dashboard: true

serversTransport:
  insecureSkipVerify: true
  rootCAs:
    - /certs/rootCA.pem

certificatesResolvers:
  rootCAs:
    - /certs/rootCA.pem
entryPoints:
  web:
    address: ":80"
  gitea-ssh:
    address: ":222"
  websecure:
    address: ":443"

providers:
  swarm:
    endpoint: "unix:///var/run/docker.sock"
    exposedByDefault: false
    network: web
  file:
    directory: /configuration/
    watch: true
# For Sablier to work:
experimental:
  plugins:
    sablier:
      moduleName: "github.com/acouvreur/sablier"
      version: "v1.6.1"

$ sestatus
SELinux status: disabled

Snippet of traefik logs:

guration/","watch":true},"providersThrottleDuration":"2s","swarm":{"defaultRule":"Host(`{{ normalize .Name }}`)","endpoint":"unix:///var/run/docker.sock","network":"web","refreshSeconds":"15s","watch":true}},"serversTransport":{"insecureSkipVerify":true,"maxIdleConnsPerHost":200,"rootCAs":["/certs/rootCA.pem"]},"tcpServersTransport":{"dialKeepAlive":"15s","dialTimeout":"30s"}}
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:25+02:00 INF github.com/traefik/traefik/v3/cmd/traefik/traefik.go:605 >
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | Stats collection is disabled.
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | Help us improve Traefik by turning this feature on :)
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | More details on: https://doc.traefik.io/traefik/contributing/data-collection/
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    |
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:25+02:00 DBG github.com/traefik/traefik/v3/pkg/plugins/plugins.go:30 > Loading of plugin: sablier: github.com/acouvreur/sablier@v1.6.1
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:25+02:00 DBG github.com/hashicorp/go-retryablehttp@v0.7.5/client.go:612 > Performing request method=GET url=https://plugins.traefik.io/public/download/github.com/acouvreur/sablier/v1.6.1
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:26+02:00 DBG github.com/hashicorp/go-retryablehttp@v0.7.5/client.go:612 > Performing request method=GET url=https://plugins.traefik.io/public/validate/github.com/acouvreur/sablier/v1.6.1
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:27+02:00 INF github.com/traefik/traefik/v3/pkg/server/configurationwatcher.go:73 > Starting provider aggregator aggregator.ProviderAggregator
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:27+02:00 DBG github.com/traefik/traefik/v3/pkg/server/server_entrypoint_tcp.go:220 > Starting TCP Server entryPointName=gitea-ssh
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:27+02:00 DBG github.com/traefik/traefik/v3/pkg/server/server_entrypoint_tcp.go:220 > Starting TCP Server entryPointName=web
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:27+02:00 DBG github.com/traefik/traefik/v3/pkg/server/server_entrypoint_tcp.go:220 > Starting TCP Server entryPointName=websecure
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:27+02:00 INF github.com/traefik/traefik/v3/pkg/provider/aggregator/aggregator.go:202 > Starting provider *file.Provider
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:27+02:00 DBG github.com/traefik/traefik/v3/pkg/provider/aggregator/aggregator.go:203 > *file.Provider provider configuration config={"directory":"/configuration/","watch":true}
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:27+02:00 DBG github.com/traefik/traefik/v3/pkg/provider/file/file.go:122 > add watcher on: /configuration/
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:27+02:00 DBG github.com/traefik/traefik/v3/pkg/provider/file/file.go:122 > add watcher on: /configuration/certificates.yml
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:27+02:00 INF github.com/traefik/traefik/v3/pkg/provider/aggregator/aggregator.go:202 > Starting provider *traefik.Provider
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:27+02:00 DBG github.com/traefik/traefik/v3/pkg/provider/aggregator/aggregator.go:203 > *traefik.Provider provider configuration config={}
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:27+02:00 INF github.com/traefik/traefik/v3/pkg/provider/aggregator/aggregator.go:202 > Starting provider *acme.ChallengeTLSALPN
traefik-swarm_reverse-proxy.0.tlnifr2qp9c9@swarm3    | 2024-05-20T20:14:27+02:00 DBG github.com/traefik/traefik/v3/pkg/provider/aggregator/aggregator.go:203 > *acme.ChallengeTLSALPN provider configuration config={}

$ curl -vk https://dashy.home
* Host dashy.home:443 was resolved.
* IPv6: (none)
* IPv4: 192.168.1.140
*   Trying 192.168.1.140:443...
* Connected to dashy.home (192.168.1.140) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_CHACHA20_POLY1305_SHA256 / x25519 / RSASSA-PSS
* ALPN: server accepted h2
* Server certificate:
*  subject: O=mkcert development certificate; OU=cloufish@carbs
*  start date: May 17 17:35:34 2024 GMT
*  expire date: Aug 17 17:35:34 2026 GMT
*  issuer: O=mkcert development CA; OU=cloufish@carbs; CN=mkcert cloufish@carbs
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
*   Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://dashy.home/
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: dashy.home]
* [HTTP/2] [1] [:path: /]
* [HTTP/2] [1] [user-agent: curl/8.7.1]
* [HTTP/2] [1] [accept: */*]
> GET / HTTP/2
> Host: dashy.home
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off

**5 MINUTES PASS AND STILL NO RESPONSE...**

$ docker service ps traefik-swarm_reverse-proxy
ID             NAME                                                        IMAGE            NODE         DESIRED STATE   CURRENT STATE                ERROR                              PORTS
ubip1a82ce10   traefik-swarm_reverse-proxy.hxi0j4zt4e28a20vdv0c3ljee       traefik:3.0      swarm2       Running         Running 23 minutes ago
eyzt6tw8em8x    \_ traefik-swarm_reverse-proxy.hxi0j4zt4e28a20vdv0c3ljee   traefik:latest   swarm2       Shutdown        Shutdown 23 minutes ago
bn8wkkny29p0    \_ traefik-swarm_reverse-proxy.hxi0j4zt4e28a20vdv0c3ljee   traefik:latest   swarm2       Shutdown        Shutdown 23 minutes ago
n85xkrslujom    \_ traefik-swarm_reverse-proxy.hxi0j4zt4e28a20vdv0c3ljee   traefik:latest   swarm2       Shutdown        Shutdown 3 hours ago
kxuxdpz6jytd    \_ traefik-swarm_reverse-proxy.hxi0j4zt4e28a20vdv0c3ljee   traefik:latest   swarm2       Shutdown        Complete 3 hours ago
yhql9i6shuv3   traefik-swarm_reverse-proxy.kdc3cszncz98z653q979x5hzf       traefik:3.0      swarm3       Running         Running 22 minutes ago
qyaq317oiqas    \_ traefik-swarm_reverse-proxy.kdc3cszncz98z653q979x5hzf   traefik:3.0      swarm3       Shutdown        Shutdown 22 minutes ago
dg8040zh6st5    \_ traefik-swarm_reverse-proxy.kdc3cszncz98z653q979x5hzf   traefik:latest   swarm3       Shutdown        Shutdown 22 minutes ago
mdi1njmd55w8    \_ traefik-swarm_reverse-proxy.kdc3cszncz98z653q979x5hzf   traefik:latest   swarm3       Shutdown        Complete 22 minutes ago
ej596pisq0ot    \_ traefik-swarm_reverse-proxy.kdc3cszncz98z653q979x5hzf   traefik:latest   swarm3       Shutdown        Failed 3 hours ago           "No such container: traefik-sw…"
hjuywy3t832q   traefik-swarm_reverse-proxy.qftbhpitrahchanm4e4cjuj6z       traefik:3.0      swarm-lite   Running         Running 23 minutes ago
uuchh9x7wx4b    \_ traefik-swarm_reverse-proxy.qftbhpitrahchanm4e4cjuj6z   traefik:latest   swarm-lite   Shutdown        Shutdown 23 minutes ago
o8wtt4t68eu6    \_ traefik-swarm_reverse-proxy.qftbhpitrahchanm4e4cjuj6z   traefik:latest   swarm-lite   Shutdown        Complete 23 minutes ago
vxct3wq2nxvz    \_ traefik-swarm_reverse-proxy.qftbhpitrahchanm4e4cjuj6z   traefik:latest   swarm-lite   Shutdown        Shutdown about an hour ago
vrxjij6fwdy9    \_ traefik-swarm_reverse-proxy.qftbhpitrahchanm4e4cjuj6z   traefik:latest   swarm-lite   Shutdown        Shutdown 15 hours ago

It seems you have a few customizations. Did it work with Traefik v2? Did you try without them in Traefik v3?

Thank you @bluepuma77 for your participation and trying to help!!!

I haven't run Traefik v2 in a 3-node cluster, only on a single node.
But I'll try it now with v2.11.2.
(I'll only switch the definition of the Docker Swarm provider to the older docker.swarmMode=true.)
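
For reference, a minimal sketch of what that provider change might look like in traefik-conf.yml. It assumes everything else in the file stays as posted above; the endpoint, exposedByDefault, and network values are simply carried over from the v3 swarm block. In Traefik v2, Swarm support is part of the Docker provider and is enabled with swarmMode.

```yaml
# traefik-conf.yml (static configuration), Traefik v2.11 variant.
# Assumption: only the provider definition changes; the rest stays as posted.
providers:
  docker:                                    # v2: Swarm is handled by the Docker provider
    swarmMode: true                          # used instead of the separate v3 "swarm" provider
    endpoint: "unix:///var/run/docker.sock"
    exposedByDefault: false
    network: web
  file:
    directory: /configuration/
    watch: true
```

The image tag in traefik-stack.yml would also need to point at the v2 release for this block to apply.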

30 minutes later

It does work, that's crazy!
I'll wait until v3 has been adopted by more people.

If it doesn’t work with v3, why not create a GitHub issue? But check the migration guide first :slight_smile:
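
One item that may be worth checking there, offered as a hedged sketch rather than a confirmed diagnosis: the Traefik v3 Swarm provider documents its provider-specific labels under the traefik.swarm prefix (traefik.swarm.network, traefik.swarm.lbswarm), while the dashy stack above uses the traefik.docker prefix. Whether this explains the behaviour in this thread is an assumption; nothing posted here confirms it.

```yaml
# dashy-stack.yml, labels only (sketch for Traefik v3 with the swarm provider).
# Assumption: only the two provider-specific labels are renamed;
# every other label is copied unchanged from the stack posted above.
services:
  dashy:
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.dashy.rule=Host(`dashy.home`)"
        - "traefik.http.services.dashy.loadbalancer.server.port=8080"
        - "traefik.http.routers.dashy.tls=true"
        - "traefik.http.routers.dashy.entrypoints=websecure"
        - "traefik.swarm.network=web"     # was traefik.docker.network=web
        - "traefik.swarm.lbswarm=true"    # was traefik.docker.lbswarm=true
```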
