[SOLVED] Traefik 3 not working with --providers.swarm.endpoint=tcp://127.0.0.1:2377

I am struggling to get a basic run of Traefik v3.0 working (I have no problems with v2.11):

$ docker run --rm -it --name traefik_v3_test --network web \
  -v /var/run/docker.sock:/var/run/docker.sock \
  traefik:v3.0 \
  --providers.swarm.endpoint=tcp://127.0.0.1:2377
2024-05-06T05:52:09Z ERR Failed to retrieve information of the docker client and server host error="Cannot connect to the Docker daemon at tcp://127.0.0.1:2377. Is the docker daemon running?" providerName=swarm
2024-05-06T05:52:09Z ERR Provider error, retrying in 305.622334ms error="Cannot connect to the Docker daemon at tcp://127.0.0.1:2377. Is the docker daemon running?" providerName=swarm
2024-05-06T05:52:10Z ERR Failed to retrieve information of the docker client and server host error="Cannot connect to the Docker daemon at tcp://127.0.0.1:2377. Is the docker daemon running?" providerName=swarm

UPDATE: Solved thanks to @bluepuma77. Instead of the nonsensical tcp://127.0.0.1:2377, use unix:///var/run/docker.sock (the socket must be mounted into the container).

Share your full Traefik static and dynamic config, and docker-compose.yml if used.

I tried to isolate the problem so that it is easy to reproduce. I can show that it works with v2.11:

docker run --rm -it --name traefik_v211_test --network web \
  -v /var/run/docker.sock:/var/run/docker.sock \
  traefik:v2.11 \
  --providers.docker=true \
  --providers.docker.swarmMode=true \
  --providers.docker.exposedByDefault=false

I assume the problem lies in how Docker is started by systemd on Ubuntu, where it does not listen on TCP:

/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

I tried to change this, but it is not easy, and I do not want Docker listening on 0.0.0.0:2377. I have the feeling that they chose the wrong way to connect to the Docker Swarm API. I connect this way myself:

docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock alpine:latest \
  sh -c 'apk add curl && curl -s --unix-socket /var/run/docker.sock http://v1.24/version'

This does not require reconfiguring Docker (which needs a maintenance window if you have only one manager) and does not have any security implications. I'm not sure how tcp://127.0.0.1:2377 can work from inside a container anyway:

$ docker exec -it 63a84b39a320 sh # connecting to running container with v3
/ # apk add nmap
/ # nmap -p 2377 127.0.0.1 172.17.0.1
Starting Nmap 7.94 ( https://nmap.org ) at 2024-05-06 09:17 UTC
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000074s latency).

PORT     STATE  SERVICE
2377/tcp closed swarm

Nmap scan report for cb-mng1 (172.17.0.1)
Host is up (0.00013s latency).

PORT     STATE SERVICE
2377/tcp open  swarm

You have to connect to the host machine, not localhost.

You mount the Docker socket (/var/run/docker.sock) into the container, but then try to use TCP (tcp://127.0.0.1:2377) to access it. That doesn't make sense. Check the simple Traefik example.
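
To make the distinction above concrete, here is a minimal sketch of a pre-flight check on the endpoint string. The function name `endpoint_kind` and its labels are hypothetical and not part of Traefik or Docker. It only encodes the reasoning from this thread: inside a container, a loopback TCP address refers to the container itself, whereas a unix:// endpoint refers to a socket file that can be bind-mounted from the host.

```shell
#!/bin/sh
# Hypothetical helper: classify a Docker endpoint string.
# Prints one of: unix, tcp-loopback, tcp, unknown.
endpoint_kind() {
  case "$1" in
    # A socket file; reachable in a container only if it is bind-mounted.
    unix://*) echo "unix" ;;
    # Loopback inside a container is the container itself, not the host.
    tcp://127.0.0.1:*|tcp://localhost:*) echo "tcp-loopback" ;;
    # Any other TCP address (for example the host's bridge IP).
    tcp://*) echo "tcp" ;;
    *) echo "unknown"; return 1 ;;
  esac
}

# Example:
#   endpoint_kind "tcp://127.0.0.1:2377"   # prints tcp-loopback
```

A "tcp-loopback" result for a containerized Traefik is a hint that the endpoint probably needs to be the mounted socket instead.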

Thanks for your Docker Compose file, but you are not running Traefik in swarm mode, and I need swarm mode. With v2.11 it was --providers.docker.swarmMode=true, but with v3.0 it should be --providers.swarm.endpoint=tcp://127.0.0.1:2377, which does not work.

$ docker run --rm -it --network web -v /var/run/docker.sock:/var/run/docker.sock traefik:v3.0 --providers.docker.swarmMode=true
{"level":"error","loader":"FLAG","time":"2024-05-06T10:36:51Z","message":"Docker provider `swarmMode` option has been removed in v3, please use the Swarm Provider instead.For more information please read the migration guide: https://doc.traefik.io/traefik/v3.0/migration/v2-to-v3/#docker-docker-swarm"}
{"level":"error","error":"command traefik error: incompatible deprecated static option found","time":"2024-05-06T10:36:51Z","message":"Command error"}

I have no idea how they expect a container to talk to tcp://127.0.0.1:2377 other than via a Unix socket, but mounting /var/run/docker.sock does not make any difference. I believe this instruction must be meant for Traefik that is not running in a container.

The doc has a really bad example, that is never going to be used in real life :rofl:

Try this

--providers.swarm=true

or

providers:
  swarm:
    endpoint: "unix:///var/run/docker.sock"
    exposedByDefault: false

For testing:

docker run --rm -it \
  -p 8080:8080 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  traefik:v3.0 \
    --providers.swarm=true \
    --api.dashboard=true \
    --api.insecure=true \
    --log.level=DEBUG
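
For comparison with the v2.11 command earlier in the thread, the following is a rough sketch of how the v3 flags could be assembled for the mounted socket. The helper names `require_socket` and `swarm_provider_flags` are made up for illustration. The two flags are the CLI form of the `endpoint` and `exposedByDefault` options shown in the YAML above, and the `web` network is the one used in the original commands.

```shell
#!/bin/sh
# Hypothetical helper: fail early if the given path is not a Unix socket.
require_socket() {
  if [ ! -S "$1" ]; then
    echo "not a Unix socket: $1" >&2
    return 1
  fi
}

# Hypothetical helper: print the Swarm provider flags for a socket path,
# one flag per line. Defaults to /var/run/docker.sock.
swarm_provider_flags() {
  sock="${1:-/var/run/docker.sock}"
  printf '%s\n' \
    "--providers.swarm.endpoint=unix://$sock" \
    "--providers.swarm.exposedByDefault=false"
}

# Example (requires Docker, so it is left commented out here):
#   sock=/var/run/docker.sock
#   require_socket "$sock" && docker run --rm -it --network web \
#     -v "$sock:$sock" traefik:v3.0 $(swarm_provider_flags "$sock")
```

The socket check is only a convenience: it turns a missing mount into an immediate, readable failure instead of a retry loop in the Traefik log.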

Cool, man! Thank you very much! I was trying to change the URL to a socket URL, but was not sure exactly how.

This is the second time you have helped me, so I think I owe you a beer. The first time, I was inspired by your proof-of-concept and created brablc/swarm-certbot-traefik on GitHub.

I created a GitHub issue to improve the documentation. It should help newbies get started.

@brablc Note that your swarm-certbot-traefik repo seems to use volumes to share the LE TLS certs. Those are local to every node, so you need to ensure the certs are distributed to all Traefik nodes. Did I mention my syncthing PoC :wink:

We currently just use Ansible to create and renew the certs with certbot dnsChallenge and upload them to a mounted directory on the Traefik servers. Not elegant, but it works without restarting Traefik (as opposed to using configs/secrets).

I do not have an external load balancer at the moment, and the manager node is a SPOF anyway, so both of my Traefik instances run on this one node, just to allow rolling updates. I hope to remember syncthing once I expand to more manager nodes :wink:
