Docker Swarm - Requests fail to reach a service on a different node

Kia ora,

I've set up a Docker Swarm with Traefik v2 as the reverse proxy, and I can access the dashboard with no issues.

I am having an issue where I cannot get a response from any service that runs on a different node from the one Traefik is running on. I've been testing and researching for two days and presume it's a network issue of some type, so I would appreciate some help figuring it out.

I've done some quick testing with an empty Nginx image, deployed as a separate stack, and got a response only when the container was on the same node as Traefik. Other stacks on the swarm that deploy across multiple nodes (but not including the Traefik node) are able to communicate with each other without issues.

Here is the test stack I was using, to provide some context.

version: '3.8'

services:
    test:
        image: nginx:latest
        deploy:
            replicas: 1
            placement:
                constraints:
                    - node.role==worker
            labels:
                - "traefik.enable=true"
                - "traefik.docker.network=uccser-dev-public"
                - "traefik.http.services.test.loadbalancer.server.port=80"
                - "traefik.http.routers.test.service=test"
                - "traefik.http.routers.test.rule=Host(`TEST DOMAIN`) && PathPrefix(`/test`)"
                - "traefik.http.routers.test.entryPoints=web"
        networks:
            - uccser-dev-public

networks:
  uccser-dev-public:
    external: true

The uccser-dev-public network is an overlay network across all nodes, with no encryption.

When I added a placement constraint pinning the service to the Traefik node, the requests worked with no issues. However, when I switched it to a different node, I got the Traefik 404 page.
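A small script can make this check repeatable while the placement constraint is changed. The sketch below is only an illustration: the node address and hostname are hypothetical placeholders, and it assumes the request goes to the `web` entry point on port 80 with a path under `/test`, as in the router labels above.

```python
import urllib.error
import urllib.request


def build_probe(node_address, host, path="/test"):
    """Build a plain-HTTP request aimed at the node that publishes port 80.

    The Host header is set explicitly so the router's Host(...) rule can match
    even when the request is addressed to the node by IP.
    """
    url = f"http://{node_address}{path}"
    return urllib.request.Request(url, headers={"Host": host})


def probe(node_address, host, path="/test", timeout=5):
    """Return (status, first bytes of the body) for one request."""
    request = build_probe(node_address, host, path)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read(200)
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses arrive as exceptions; the body is still readable.
        return error.code, error.read(200)


if __name__ == "__main__":
    # Hypothetical values: replace with the Traefik node's address and the
    # domain used in the router rule.
    print(probe("192.0.2.10", "example.test"))
```

Comparing the status and body for the two placements shows whether the reply comes from the proxy or from the container behind it.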

The Traefik dashboard shows that it sees the service.

However, the access logs show the following:

proxy_traefik.1.6fbx58k4n3fj@SWARM_NODE    | IP_ADDRESS - - [21/Jul/2021:09:03:02 +0000] "GET / HTTP/2.0" - - "-" "-" 1430 "-" "-" 0ms

It's just blank, and I don't know how to proceed from here. The normal log shows no errors that I can see.
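To see exactly which fields are blank, the access log line above can be split into named fields. This is a rough sketch that assumes the field order of Traefik's default common log format (status, size, referrer, user agent, request count, router name, server URL, duration); the function and field names are made up for illustration.

```python
import re

# Assumed field order, based on Traefik's documented common log format.
ACCESS_LOG = re.compile(
    r'(?P<client>\S+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\S+) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)" '
    r'(?P<request_count>\d+) '
    r'"(?P<router>[^"]*)" "(?P<server_url>[^"]*)" '
    r'(?P<duration_ms>\d+)ms'
)


def parse_access_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    # Drop the "service.task@node | " prefix added by `docker service logs`.
    _, _, payload = line.rpartition("| ")
    match = ACCESS_LOG.search(payload)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        'proxy_traefik.1.6fbx58k4n3fj@SWARM_NODE    | IP_ADDRESS - - '
        '[21/Jul/2021:09:03:02 +0000] "GET / HTTP/2.0" - - "-" "-" '
        '1430 "-" "-" 0ms'
    )
    for name, value in parse_access_line(sample).items():
        print(f"{name:14} {value}")
```

If the assumed field order is right, the `-` values for the router and server URL would mean that Traefik did not attribute this request to any router or backend. The logged request is also `GET /` over HTTP/2.0, while the test router only matches paths under `/test` on the `web` entry point, so this particular line may not belong to a test request at all.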

Traefik stack file:

version: '3.8'

x-default-opts:
  &default-opts
  logging:
    options:
      max-size: '1m'
      max-file: '3'

services:
  # Custom proxy to secure docker socket for Traefik
  docker-socket:
    <<: *default-opts
    image: tecnativa/docker-socket-proxy
    networks:
      - traefik-docker
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      NETWORKS: 1
      SERVICES: 1
      SWARM: 1
      TASKS: 1
    deploy:
      placement:
        constraints:
          - node.role == manager

  # Reverse proxy for handling requests
  traefik:
    <<: *default-opts
    image: traefik:2.4.11
    networks:
      - uccser-dev-public
      - traefik-docker
    volumes:
      - traefik-public-certificates:/etc/traefik/acme/
    ports:
      - target: 80 # HTTP
        published: 80
        protocol: tcp
        mode: host
      - target: 443 # HTTPS
        published: 443
        protocol: tcp
        mode: host
    command:
      # Docker
      - --providers.docker
      - --providers.docker.swarmmode
      - --providers.docker.endpoint=tcp://docker-socket:2375
      - --providers.docker.exposedByDefault=false
      - --providers.docker.network=uccser-dev-public
      - --providers.docker.watch
      - --api
      - --api.dashboard
      - --entryPoints.web.address=:80
      - --entryPoints.websecure.address=:443
      - --log.level=DEBUG
      - --global.sendAnonymousUsage=false
    deploy:
      placement:
        constraints:
            - node.role==worker
      # Dynamic Configuration
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.dashboard.rule=Host(`SWARM_NODE`) && (PathPrefix(`/api`) || PathPrefix(`/dashboard`))"
        - "traefik.http.routers.dashboard.service=api@internal"
        - "traefik.http.services.dummy-svc.loadbalancer.server.port=9999" # Dummy service for Swarm port detection. The port can be any valid integer value.

volumes:
  traefik-public-certificates: {}

networks:
  # This network is used by other services
  # to connect to the proxy.
  uccser-dev-public:
    external: true
  # This network is used for Traefik to talk to
  # the Docker socket.
  traefik-docker:
    driver: overlay
    driver_opts:
      encrypted: 'true'

Any ideas?

Thanks for your time,

Jack

Further testing showed other services were working on different nodes, so I figured it must be an issue with my application. It turns out my Django application still had a bunch of HTTPS-related settings configured for its previous hosting location. As the requests weren't passing the required settings, the application denied them before they were processed. I also needed to lower the logging level for gunicorn (WSGI) to see more information.
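The post does not say which settings were involved. As a hedged illustration only, these are standard Django settings that commonly cause requests to be rejected or redirected when an application moves behind a different proxy. The environment variable names and the `env_flag` helper are hypothetical, and the sketch assumes the proxy sets `X-Forwarded-Proto`.

```python
import os


def env_flag(name, default=False):
    """Read a boolean from the environment ("1", "true", "yes" count as true)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes"}


# Hypothetical switch: enable only when TLS is terminated in front of the app.
BEHIND_TLS_PROXY = env_flag("DJANGO_BEHIND_TLS_PROXY")

# With DEBUG off, Django answers 400 for Host headers that are not listed here.
ALLOWED_HOSTS = [
    host.strip()
    for host in os.environ.get("DJANGO_ALLOWED_HOSTS", "").split(",")
    if host.strip()
]

# When True, SecurityMiddleware redirects every non-HTTPS request to HTTPS.
SECURE_SSL_REDIRECT = BEHIND_TLS_PROXY

# Secure cookies are only sent back by browsers over HTTPS.
SESSION_COOKIE_SECURE = BEHIND_TLS_PROXY
CSRF_COOKIE_SECURE = BEHIND_TLS_PROXY

# Tells Django which forwarded header marks a request as HTTPS. Without it,
# SECURE_SSL_REDIRECT behind a TLS-terminating proxy can loop on redirects.
SECURE_PROXY_SSL_HEADER = (
    ("HTTP_X_FORWARDED_PROTO", "https") if BEHIND_TLS_PROXY else None
)
```

For the logging side, gunicorn accepts a `--log-level debug` option, which is one way to get the extra detail mentioned above.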

In summary, Traefik and Swarm were fine.
