Getting Bad Gateway intermittently when routing to Portainer through Traefik

I am seeing some really weird behaviour with Traefik when exposing the Portainer application.

Initially I set up Portainer on my network, on a VM running Docker in swarm mode, and accessed it by IP on port 9443. I then deployed Traefik from the Portainer UI using the YAML below:

version: '3.2'

services:

  traefik:
    image: traefik:latest
    environment:
      ACME_DNS_API_BASE: https://auth.acme-dns.io
      ACME_DNS_STORAGE_PATH: /acme-dns.json
    command:
      - --entrypoints.web.http.redirections.entryPoint.to=websecure
      - --entrypoints.web.http.redirections.entryPoint.scheme=https
      - --entrypoints.web.http.redirections.entrypoint.permanent=true
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --entrypoints.websecure.http.tls.certResolver=leresolver
      - --entrypoints.websecure.http.tls.domains[0].main=mydomain.com
      - --entrypoints.websecure.http.tls.domains[0].sans=*.mydomain.com
      - --providers.docker=true
      - --providers.swarm.endpoint=unix:///var/run/docker.sock
      - --providers.docker.exposedbydefault=false
      - --providers.docker.network=public
      - --api=true
      - --log.level=DEBUG
      - --certificatesResolvers.leresolver.acme.email=me@mydomain.com
      - --certificatesResolvers.leresolver.acme.storage=/acme.json
      - --certificatesResolvers.leresolver.acme.keyType=EC384
      - --certificatesResolvers.leresolver.acme.dnsChallenge=true
      - --certificatesResolvers.leresolver.acme.dnsChallenge.provider=acme-dns
      - --certificatesResolvers.leresolver.acme.dnsChallenge.resolvers=1.1.1.1:53,8.8.8.8:53
    ports:
      - "80:80"
      - "443:443"
    networks:
      - public
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /obtools-drive/traefik_data/acme.json:/acme.json
      - /obtools-drive/traefik_data/acme-dns.json:/acme-dns.json

    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [ node.role == manager ]
      labels:
        - "traefik.enable=true"
        #dashboard
        - "traefik.http.routers.dashboard.rule=Host(`traefik.mydomain.com`)"
        - "traefik.http.routers.dashboard.entrypoints=websecure"
        - "traefik.http.routers.dashboard.service=dashboard@internal"
        - "traefik.http.routers.dashboard.middlewares=traefik-auth"
        #api
        - "traefik.http.routers.api.rule=Host(`traefik.mydomain.com`) && PathPrefix(`/api`)"
        - "traefik.http.routers.api.entrypoints=websecure"
        - "traefik.http.routers.api.service=api@internal"
        - "traefik.http.routers.api.middlewares=traefik-auth"
        #general
        - "traefik.http.services.traefik.loadbalancer.server.port=80"
        - "traefik.http.middlewares.traefik-auth.basicauth.users=admin:blabla"

networks:
  public:
    external: true

Traefik came up, and then I deployed another application to route through Traefik using the certs that were mounted to Traefik. Everything works perfectly for this application, and the Traefik UI itself is accessible.

However, I then recreated my Portainer stack to also route through Traefik, and now I am running into issues. Mostly I get a Bad Gateway or Gateway Timeout response, but sometimes the login page loads. Once I was even able to log in over HTTPS, but when I tried to manage the stack it logged me out.

My Portainer YAML is below:

version: '3.2'

services:
  agent:
    image: portainer/agent:2.21.4
    environment:
      # REQUIRED: Should be equal to the service name prefixed by "tasks." when
      # deployed inside an overlay network
      AGENT_CLUSTER_ADDR: tasks.agent
      # AGENT_PORT: 9001
      # LOG_LEVEL: debug
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /var/lib/docker/volumes:/var/lib/docker/volumes
    networks:
      - agent_network
    deploy:
      mode: global
      placement:
        constraints: [node.platform.os == linux]

  portainer:
    image: portainer/portainer-ce:2.21.4
    command: -H tcp://tasks.agent:9001 --tlsskipverify
    ports:
      - "9000:9000"
      - "9443:9443"
    volumes:
      - portainer_data:/data
    networks:
      - public
      - agent_network
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [node.role == manager]
      labels:
        # Frontend
        - "traefik.enable=true"
        - "traefik.http.routers.frontend.rule=Host(`portainer.mydomain.com`)"
        - "traefik.http.routers.frontend.entrypoints=websecure"
        - "traefik.http.services.frontend.loadbalancer.server.port=9000"
        - "traefik.http.routers.frontend.service=frontend"
        # Edge
        - "traefik.http.routers.edge.rule=Host(`edge.mydomain.com`)"
        - "traefik.http.routers.edge.entrypoints=websecure"
        - "traefik.http.services.edge.loadbalancer.server.port=8000"
        - "traefik.http.routers.edge.service=edge"

networks:
  public:
    external: true
  agent_network:
    external: true

volumes:
  portainer_data:
    driver_opts:
      type: none
      device: /obtools-drive/portainer_data
      o: bind

The public network is the same one used by Traefik and the other application. I think agent_network is for the Portainer UI container to talk to the Portainer agent container. It is external and uses the overlay driver:

NETWORK ID     NAME              DRIVER    SCOPE
x9feja8n8h01   agent_network     overlay   swarm
1bc4265eab70   bridge            bridge    local
173e77d0a1f4   docker_gwbridge   bridge    local
76ed02c70011   host              host      local
9zr6pb907cq6   ingress           overlay   swarm
b32ec228c49d   none              null      local
czbg8h3kqshy   public            overlay   swarm

Portainer is still accessible via IP when I publish port 9443 in the YAML and recreate the stack.

Make everything providers.swarm. Your Traefik command currently mixes providers.docker.* and providers.swarm.* options.
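
For anyone landing here later, this is a minimal sketch of what that change might look like in the Traefik stack's command list. It assumes Traefik v3, where the Swarm provider has its own providers.swarm.* options, and the same external `public` network as in the question. Only the provider lines are shown; the entrypoint and certificate resolver flags are assumed to stay as they were. One plausible reading of the symptoms, not confirmed in this thread, is that the network setting on the Docker provider had no effect on the Swarm provider, so Traefik sometimes picked the Portainer task's address on agent_network instead of public.

```yaml
services:
  traefik:
    command:
      # Replaces the three --providers.docker.* lines from the question.
      # Setting the endpoint enables the Swarm provider.
      - --providers.swarm.endpoint=unix:///var/run/docker.sock
      # Only route services that carry traefik.enable=true.
      - --providers.swarm.exposedByDefault=false
      # Default network Traefik uses to reach services attached to
      # more than one network (Portainer is on public and agent_network).
      - --providers.swarm.network=public
```

The deploy labels on the Portainer service should not need to change for this; only the static configuration of Traefik does.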


That was it! Working perfectly now, thanks so much. I didn't even notice that :grinning:
