Lack of sample for a working swarm stack in V3

Hi all.
After 3 days of searching, mostly looping over the same pages, I'm still without any solution for setting up a small stack that only involves Traefik and its whoami service.
I've set up a Swarm cluster. I think it works well, but perhaps I've missed something there.
I've created an attachable overlay network.
Then I started with a Docker Compose service that works perfectly... but only on the node where I declare it.
Here is the code used (not sure how to enter code here, I hope it will be readable):

services:
  traefik:
    image: traefik:{{ trfversion }}
    restart: unless-stopped
    networks:
      trf-{{ clustername }}:
    security_opt:
      - no-new-privileges:true
    command:
      - --accesslog=true
      - --accesslog.filepath=/tmp/access.log
      - --log=true
      - --log.level=TRACE
      - --log.filepath=/tmp/traefik.log
      - --api=true
      - --api.dashboard=true
      - --api.debug=true
      - --api.disabledashboardad=true
      - --api.insecure=true
      - --providers.docker=true
      - --providers.docker.exposedbydefault=true
      - --providers.docker.network=trf-{{ clustername }}
      - --providers.docker.endpoint=unix:///var/run/docker.sock
      - --providers.file.directory=/etc/traefik/dynamic_conf
      - --entryPoints.http.address=:80
      - --entrypoints.http.http.redirections.entryPoint.to=https
      - --entrypoints.http.http.redirections.entryPoint.scheme=https
      - --entrypoints.http.http.redirections.entrypoint.permanent=true
      - --entrypoints.https.address=:443
      - --serverstransport.insecureskipverify=true
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /mnt/traefik/data/certconf.yml:/etc/traefik/dynamic_conf/conf.yml:ro
      - /mnt/traefik/certs:/certs
      - /mnt/traefik/logs/:/tmp/
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.traefik.rule=Host(`{{ traefik_fqdn }}`)"
      - "traefik.http.middlewares.traefik-https-redirect.redirectscheme.scheme=https"
      - "traefik.http.middlewares.sslheader.headers.customrequestheaders.X-Forwarded-Proto=https"
      - "traefik.http.routers.traefik.middlewares=traefik-https-redirect"
      - "traefik.http.routers.traefik-secure.entrypoints=https"
      - "traefik.http.routers.traefik-secure.rule=Host(`{{ traefik_fqdn }}`)"
      - "traefik.http.routers.traefik-secure.tls=true"
      - "traefik.http.routers.traefik-secure.service=api@internal"

  whoami:
    image: "traefik/whoami"
    networks:
      trf-{{ clustername }}:
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.whoami.rule=Host(`whoami.xcorp.net`)"
      - "traefik.http.routers.whoami.entrypoints=https"
      - "traefik.http.routers.whoami.tls=true"
      - "traefik.http.services.whoami-service.loadbalancer.server.port=80"

networks:
  trf-{{ clustername }}:
    external: true

I certainly used more options than needed, but this lets me be sure of the settings.
As I said, everything works: I can reach the Traefik FQDN over http/https, which redirects to the dashboard, and reaching the whoami URL also succeeds.

From there I tried to make it a little more HA and convert the code into a stack to deploy.
The modification mainly involved moving the labels under the deploy key and adding the "swarm related" parameters to the command: section.
The documentation is so bad for that on V3 ... My understanding is that you introduced a new provider, but you still use the docker provider.
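
For context on the provider question: Traefik v3 split Swarm support out of the docker provider into a separate swarm provider, whose options sit under the providers.swarm prefix. Below is a minimal sketch of a Swarm-only stack, assuming a hypothetical overlay network named trf-example, a hypothetical image tag, and a hypothetical hostname. It illustrates the idea under those assumptions and is not the poster's configuration.

```yaml
# Sketch only: trf-example, the v3.1 tag and whoami.example.com are hypothetical placeholders.
services:
  traefik:
    image: traefik:v3.1
    networks:
      - trf-example
    command:
      # In v3, Swarm services are discovered by the swarm provider.
      - --providers.swarm=true
      - --providers.swarm.endpoint=unix:///var/run/docker.sock
      - --providers.swarm.exposedbydefault=false
      - --providers.swarm.network=trf-example
      - --entrypoints.https.address=:443
    ports:
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    deploy:
      placement:
        constraints:
          # The mounted Docker socket must belong to a manager node.
          - node.role == manager

  whoami:
    image: traefik/whoami
    networks:
      - trf-example
    deploy:
      labels:
        # For Swarm services, Traefik reads service labels (deploy.labels).
        - "traefik.enable=true"
        - "traefik.http.routers.whoami.rule=Host(`whoami.example.com`)"
        - "traefik.http.routers.whoami.entrypoints=https"
        - "traefik.http.routers.whoami.tls=true"
        - "traefik.http.services.whoami.loadbalancer.server.port=80"

networks:
  trf-example:
    external: true
```

Whether the docker provider needs to stay enabled alongside it depends on whether plain (non-Swarm) containers on the same host should also be routed.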

Anyway, when I deploy the stack, nothing works anymore. Same unmodified cluster, both containers pinned to the first node for debugging purposes.
I get no errors in the Traefik logs that I monitor. I get regular updates saying:

2024-08-06T09:38:12Z DBG github.com/traefik/traefik/v3/pkg/server/configurationwatcher.go:227 > Configuration received config={"http":{"middlewares":{"sslheader":{"headers":{"customRequestHeaders":{"X-Forwarded-Proto":"https"}}},"traefik-https-redirect":{"redirectScheme":{"scheme":"https"}}},"routers":{"traefik":{"middlewares":["traefik-https-redirect"],"rule":"Host(proxdnl.xcorp.net)","service":"dashboard"},"traefik-secure":{"entryPoints":["https"],"rule":"Host(proxdnl.xcorp.net)","service":"api@internal","tls":{}},"whoami":{"entryPoints":["https"],"rule":"Host(whoami.xcorp.net)","service":"whoami-service","tls":{}}},"services":{"dashboard":{"loadBalancer":{"passHostHeader":true,"responseForwarding":{"flushInterval":"100ms"},"servers":[{"url":"http://30.20.0.11:8080"}]}},"whoami-service":{"loadBalancer":{"passHostHeader":true,"responseForwarding":{"flushInterval":"100ms"},"servers":[{"url":"http://30.20.3.140:80"}]}}}},"tcp":{},"tls":{},"udp":{}} providerName=swarm

Nothing in the access.log and an ERR_CONNECTION_REFUSED on each URL.

I'm lost.
Any advice on how I can debug this?
Could it be that my Traefik config is missing something, or is it more a problem with my Swarm network routing?

Thanks in advance for any help you can provide.
Regards
Stef

Sorry, I forgot to add the modified stack.yml made from the original working Docker Compose file.

Here it is:

services:
  traefik:
    image: traefik:{{ trfversion }}
    networks:
      trf-{{ clustername }}:
    command:
      - --accesslog=true
      - --accesslog.filepath=/tmp/access.log
      - --log=true
      - --log.level=TRACE
      - --log.filepath=/tmp/traefik.log
      - --api=true
      - --api.dashboard=true
      - --api.debug=true
      - --api.disabledashboardad=true
      - --api.insecure=true
      - --providers.swarm=true
      - --providers.swarm.endpoint=unix:///var/run/docker.sock
      - --providers.docker=true
      - --providers.docker.exposedbydefault=true
      - --providers.docker.network=trf-{{ clustername }}
      - --providers.docker.endpoint=unix:///var/run/docker.sock
      - --providers.file.directory=/etc/traefik/dynamic_conf
      - --entryPoints.http.address=:80
      - --entrypoints.http.http.redirections.entryPoint.to=https
      - --entrypoints.http.http.redirections.entryPoint.scheme=https
      - --entrypoints.http.http.redirections.entrypoint.permanent=true
      - --entrypoints.https.address=:443
      - --serverstransport.insecureskipverify=true
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /mnt/traefik/data/certconf.yml:/etc/traefik/dynamic_conf/conf.yml:ro
      - /mnt/traefik/certs:/certs
      - /mnt/traefik/logs/:/tmp/
    deploy:
      placement:
        constraints:
          - node.hostname == cldnl-mgt-01
      restart_policy:
        condition: on-failure
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.traefik.rule=Host(`{{ traefik_fqdn }}`)"
        - "traefik.http.middlewares.traefik-https-redirect.redirectscheme.scheme=https"
        - "traefik.http.middlewares.sslheader.headers.customrequestheaders.X-Forwarded-Proto=https"
        - "traefik.http.routers.traefik.middlewares=traefik-https-redirect"
        - "traefik.http.routers.traefik-secure.entrypoints=https"
        - "traefik.http.routers.traefik-secure.rule=Host(`{{ traefik_fqdn }}`)"
        - "traefik.http.routers.traefik-secure.tls=true"
        - "traefik.http.routers.traefik-secure.service=api@internal"
        - "traefik.http.routers.traefik.service=dashboard"
        - "traefik.http.services.dashboard.loadbalancer.server.port=8080"

  whoami:
    image: "traefik/whoami"
    networks:
      trf-{{ clustername }}:
    deploy:
      placement:
        constraints:
          - node.hostname == cldnl-mgt-01
      restart_policy:
        condition: on-failure
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.whoami.rule=Host(`whoami.xcorp.net`)"
        - "traefik.http.routers.whoami.entrypoints=https"
        - "traefik.http.routers.whoami.tls=true"
        - "traefik.http.services.whoami-service.loadbalancer.server.port=80"

networks:
  trf-{{ clustername }}:
    external: true

Check updated simple Traefik Swarm example.

Thanks, I'll check that sample 👍

Hi all,
It wasn't that easy, but the sample gave me some hints about what was going on.
The 2 major points that let me fix the problem were:

  • Traefik needs to be deployed in global mode if you want to reach it from any node of the cluster at random (if you have an HA system in front). Otherwise the ports are not exposed on every node, so packets can't reach the proxy.
  • Equally important, the ports need to be published in host mode.
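
Reduced to just those two settings, the change looks roughly like the fragment below. It is a sketch meant to be merged into a full service definition. It shows port 443 only, and the comments describe general Swarm behavior rather than anything specific to this cluster.

```yaml
# Fragment only: merge into an existing traefik service definition.
services:
  traefik:
    ports:
      - target: 443
        published: 443
        protocol: tcp
        mode: host      # bind the port directly on the node running the task
    deploy:
      mode: global      # one task on every node that matches the placement constraints
```

With host-mode publishing, a node only listens on the port if a Traefik task runs on it, which is why the two settings tend to go together.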

In summary, below is a working sample. I'm not using an external certificate provider. Instead I have my own cert and key referenced by a dynamic conf... but that's not the point. You can replace this with any ACME config.
In the sample, my overlay network is also created ahead of time.

Sorry for the formatting of the following lines.

services:
  traefik:
    image: traefik:{{ trfversion }}
    hostname: '{{ ansible_fqdn }}'
    networks:
      trf-{{ clustername }}:
    command:
      - --accesslog=true
      - --accesslog.filepath=/tmp/access.log
      - --log=true
      - --log.level=TRACE
      - --log.filepath=/tmp/traefik.log
      - --api=true
      - --api.dashboard=true
      - --api.debug=true
      - --api.disabledashboardad=true
      - --api.insecure=true
      - --providers.swarm=true
      - --providers.swarm.endpoint=unix:///var/run/docker.sock
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --providers.docker.network=trf-{{ clustername }}
      - --providers.docker.endpoint=unix:///var/run/docker.sock
      - --providers.file.directory=/etc/traefik/dynamic_conf
      - --entryPoints.http.address=:80
      - --entrypoints.http.http.redirections.entryPoint.to=https
      - --entrypoints.http.http.redirections.entryPoint.scheme=https
      - --entrypoints.http.http.redirections.entrypoint.permanent=true
      - --entrypoints.https.address=:443
      - --serverstransport.insecureskipverify=true
    ports:
      - target: 80
        published: 80
        protocol: tcp
        mode: host
      - target: 443
        published: 443
        protocol: tcp
        mode: host
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /mnt/traefik/data/certconf.yml:/etc/traefik/dynamic_conf/conf.yml:ro
      - /mnt/traefik/certs:/certs
      - /mnt/traefik/logs/:/tmp/
    deploy:
      mode: global
      placement:
        constraints:
          - node.role==manager
      restart_policy:
        condition: on-failure
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.traefik.rule=Host(`{{ traefik_fqdn }}`)"
        - "traefik.http.middlewares.traefik-https-redirect.redirectscheme.scheme=https"
        - "traefik.http.middlewares.sslheader.headers.customrequestheaders.X-Forwarded-Proto=https"
        - "traefik.http.routers.traefik.middlewares=traefik-https-redirect"
        - "traefik.http.routers.traefik-secure.entrypoints=https"
        - "traefik.http.routers.traefik-secure.rule=Host(`{{ traefik_fqdn }}`)"
        - "traefik.http.routers.traefik-secure.tls=true"
        - "traefik.http.routers.traefik-secure.service=api@internal"
        - "traefik.http.routers.traefik.service=dashboard"
        - "traefik.http.services.dashboard.loadbalancer.server.port=8080"

  whoami:
    image: "traefik/whoami"
    networks:
      trf-{{ clustername }}:
    deploy:
      restart_policy:
        condition: on-failure
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.whoami.rule=Host(`whoami.xcorp.net`)"
        - "traefik.http.routers.whoami.entrypoints=https"
        - "traefik.http.routers.whoami.tls=true"
        - "traefik.http.services.whoami-service.loadbalancer.server.port=80"

networks:
  trf-{{ clustername }}:
    external: true

You don’t need to deploy Traefik globally.

If you remove mode: host, the port is published through the Swarm ingress network instead. That means all nodes open the port and send traffic to whatever replicas are set up, which could be a single one.

ports:
  - target: 80
    published: 80
    protocol: tcp
    mode: host
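
A minimal sketch of that alternative, assuming a single replica and ingress publishing. The replicas value, the image tag and the manager constraint are illustrative assumptions, and the fragment omits command, volumes and labels for brevity.

```yaml
# Fragment only: hypothetical tag and replica count, other settings omitted.
services:
  traefik:
    image: traefik:v3.1
    ports:
      - target: 80
        published: 80
        protocol: tcp
        mode: ingress   # the default when mode is omitted; every node listens on the port
      - target: 443
        published: 443
        protocol: tcp
        mode: ingress
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
```

One commonly cited trade-off is that traffic arriving through the ingress routing mesh is source-NATed, so Traefik may not see the original client IP. Host mode avoids that at the cost of needing a task on every node that receives traffic.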