Docker swarm multi manager with traefik

twisterrm · March 24, 2023, 1:18pm

Hi all,
I have a 5-node swarm (3 managers and 2 workers). The managers share a VIP address via keepalived that sits on the first manager; if it goes down, the VIP moves to the second, and so on. All the entrypoint handling is done by Traefik.
I'm quite new to this kind of infrastructure, so maybe my question is silly.
If I schedule a service (let's say Grafana) on the first manager, everything works fine; if I schedule Grafana on the second or third manager, I get a 504 Gateway Timeout error.
Could it be a bad Traefik configuration?
If I go directly to the IP of the node I can reach Grafana, so it seems to be a Traefik issue.
There is no firewall between the nodes, and at the moment I don't need TLS/HTTPS/certs.

Docker compose grafana

version: '3'
services:
  prometheus:
    image: prom/prometheus:latest
    deploy:
      labels:
        - "traefik.enable=false"
    networks:
      - monitoring
    ports:
      - 9090:9090
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.path=/prometheus
      - --web.console.libraries=/usr/share/prometheus/console_libraries
      - --web.console.templates=/usr/share/prometheus/consoles
    volumes:
      - /mnt/docker_data/volumes/monitoring/prometheus:/etc/prometheus/
      - /mnt/docker_data/volumes/monitoring/prometheus_data:/prometheus
  grafana:
    image: grafana/grafana
    ports:
      - 3000:3000
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=xxxxxxxxxxxxxxxxxxxx
      - GF_SERVER_DOMAIN=monitoring.example.local
    depends_on:
      - prometheus
    volumes:
      - /mnt/docker_data/volumes/monitoring/grafana-storage:/var/lib/grafana
    networks:
      - traefik_public
      - monitoring
    deploy:
      placement:
        constraints: [node.role == manager]
      labels:
        - "traefik.enable=true"
        - "traefik.docker.network=traefik_public"
        - "traefik.http.routers.grafana_ict.rule=Host(`monitoring.example.local`)"
        - "traefik.port=3000"
        - "traefik.http.routers.grafana_ict.service=grafana_ict"
        - "traefik.http.routers.grafana_ict.entrypoints=http"
        - "traefik.http.services.grafana_ict.loadbalancer.server.port=3000"

networks:
  traefik_public:
    external: true
  monitoring:
    external: true

If I inspect the container, I see the IP assigned on traefik_public, and it is the same one shown in the Traefik dashboard as the load-balanced IP/port.
The Traefik log reports only:

time="2023-03-24T11:29:23Z" level=debug msg="'504 Gateway Timeout' caused by: dial tcp 10.0.5.141:3000: i/o timeout"

10.0.5.141 is the IP assigned on master2.

I expected Traefik to route traffic even when the service is not running on the first manager node.
I cannot figure out whether this is a configuration error on my side or whether Traefik doesn't work this way.
Thanks in advance.

bluepuma77 · March 24, 2023, 1:25pm

Tell Traefik which docker.network to use to forward requests to the service. Your service is attached to multiple networks, but Traefik probably shares just one with it. You can set that globally in providers.docker or per service with labels.
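
For reference, the global variant could look roughly like the sketch below. It assumes Traefik v2 with a file-based static configuration; the network name traefik_public is taken from the compose file above, and the rest is illustrative rather than the poster's actual setup.

```yaml
# traefik.yml (static configuration), a sketch assuming Traefik v2
providers:
  docker:
    # Read labels from Swarm services (deploy.labels) instead of containers
    swarmMode: true
    # Default network Traefik uses to reach services attached to several networks
    network: traefik_public
    # Only route services that set traefik.enable=true
    exposedByDefault: false
```

The per-service label traefik.docker.network=traefik_public, as already used on the grafana service, overrides this default for that one service.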

twisterrm · March 24, 2023, 3:09pm

I already specified it on the Grafana service; I don't need to expose Prometheus via Traefik for now. Sadly, it doesn't work.

bluepuma77 · March 24, 2023, 5:40pm

Did you define it as overlay network? If using compose to create the network, did you give it a name (otherwise the name may be extended by compose)?

PS: we run a multi-manager Docker Swarm with Traefik and Grafana and it works like a charm, so it is possible.
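
To illustrate the naming point: if the network is created from a compose file rather than with the CLI, a fixed name keeps the stack name from being prepended to it. This is only a sketch; it assumes compose file format 3.5 or later (needed for the name key) and that the network is declared in the stack that runs Traefik.

```yaml
version: '3.5'

networks:
  traefik_public:
    # Without an explicit name, the network would be created as <stack>_traefik_public
    name: traefik_public
    # Overlay networks span all swarm nodes, so Traefik can reach tasks on any node
    driver: overlay
    # Optional: also allow standalone containers to attach, useful for debugging
    attachable: true
```

Other stacks can then reference it with external: true, as the grafana compose file above does.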

twisterrm · March 24, 2023, 5:55pm

Yes, it is an overlay network created with the Docker CLI. I've been fighting this issue for months.

twisterrm · March 27, 2023, 9:40am

Some steps forward: Grafana now works as intended, and I just need to connect Prometheus. I don't want Prometheus to go through the http entrypoint and answer on http://monitoring.example.local; I want to force it onto http://monitoring.example.local:9090. How can I do that? I tried with:

version: '3'
services:
  prometheus:
    image: prom/prometheus:latest
    deploy:
      labels:
        - "traefik.enable=false"
        - "traefik.docker.network=traefik_public"
        - "traefik.http.routers.prometheus_ict.rule=Host(`monitoring-example.local`)"
        - "traefik.http.services.prometheus_ict.loadbalancer.server.port=9090"
        - "traefik.http.routers.prometheus_ict.service=prometheus_ict"
        - "traefik.port=9090"
    networks:
      - monitoring
      - traefik_public
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.path=/prometheus
      - --web.console.libraries=/usr/share/prometheus/console_libraries
      - --web.console.templates=/usr/share/prometheus/consoles
    volumes:
      - /mnt/docker_data/volumes/monitoring/prometheus:/etc/prometheus/
      - /mnt/docker_data/volumes/monitoring/prometheus_data:/prometheus
  grafana:
    image: grafana/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=xxxxxxxxxxxxxxxxxxxx
      - GF_SERVER_DOMAIN=monitoring-example.local
    depends_on:
      - prometheus
    volumes:
      - /mnt/docker_data/volumes/monitoring/grafana-storage:/var/lib/grafana
    networks:
      - traefik_public
      - monitoring
    deploy:
      placement:
        constraints: [node.role == manager]
      labels:
        - "traefik.enable=true"
        - "traefik.docker.network=traefik_public"
        - "traefik.http.routers.grafana_ict.rule=Host(`monitoring.example.local`)"
        - "traefik.port=3000"
        - "traefik.http.routers.grafana_ict.service=grafana_ict"
        - "traefik.http.routers.grafana_ict.entrypoints=http"
        - "traefik.http.services.grafana_ict.loadbalancer.server.port=3000"

networks:
  traefik_public:
    external: true
  monitoring:
    external: true

But everything stopped working: 504 from Grafana on monitoring.example.local (even though it is untouched) and 404 on monitoring.example.local:9090.
I know the issue is in the Traefik configuration, but I don't know how to solve it.

bluepuma77 · March 27, 2023, 11:34am

You are using Host(`monitoring.example.local`) and Host(`monitoring-example.local`), is that on purpose?

You want Grafana to use Traefik to connect to Prometheus? On external (Traefik) port 9090? Then you need to enable a Traefik entrypoint on port 9090 and let Prometheus use it.
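
A rough sketch of that first option, assuming Traefik v2 configured through CLI flags and the 9090 port from the question. The entrypoint name prometheus, the traefik service definition and port 80 for the existing http entrypoint are assumptions for illustration; only the router, service and network names come from the compose file above.

```yaml
services:
  traefik:
    # Hypothetical fragment of the Traefik stack; merge into the real definition
    command:
      - --providers.docker.swarmMode=true
      - --entrypoints.http.address=:80
      # Extra entrypoint that listens on 9090 (the name "prometheus" is an assumption)
      - --entrypoints.prometheus.address=:9090
    ports:
      - 80:80
      # Traefik publishes 9090, so the prometheus service must not publish it itself
      - 9090:9090

  prometheus:
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.docker.network=traefik_public"
        - "traefik.http.routers.prometheus_ict.rule=Host(`monitoring.example.local`)"
        # Bind this router to the new entrypoint instead of "http"
        - "traefik.http.routers.prometheus_ict.entrypoints=prometheus"
        - "traefik.http.routers.prometheus_ict.service=prometheus_ict"
        - "traefik.http.services.prometheus_ict.loadbalancer.server.port=9090"
```

Note that the compose file above sets traefik.enable=false on prometheus; if exposedByDefault is off or that label stays false, Traefik ignores the other labels on that service.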

Or you can let Grafana connect to Prometheus using the Docker network monitoring (without Traefik), the Docker DNS should resolve prometheus to the internal IP.
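
For this second option, a Grafana datasource provisioning file could look something like the sketch below. It assumes both services stay in the same stack on the monitoring network, so that Docker DNS resolves the service name prometheus, and that the file is mounted under /etc/grafana/provisioning/datasources/ in the Grafana container. The datasource name is arbitrary.

```yaml
# prometheus.yml, a Grafana datasource provisioning file (sketch)
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    # "proxy" means the Grafana backend makes the request, not the browser
    access: proxy
    # Service name resolved by Docker DNS on the shared "monitoring" network
    url: http://prometheus:9090
    isDefault: true
```

The same URL can also be entered by hand in the Grafana datasource settings; no Traefik router or published port is needed for Prometheus in that case.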

twisterrm · March 27, 2023, 11:49am

The host is meant to be the same; I made an error while redacting the real hostname in the compose file.
I solved the issue using the internal "monitoring" network. The desired behaviour would be:
monitoring.example.com - grafana
monitoring.example.com:9090 - prometheus
I will add an entrypoint and start testing. Thank you.
