Worker service won't load unless IP points to master

Hi all,
I've just migrated from nginx-proxy. So far everything is going great. I followed this guide: http://dockerswarm.rocks/

  1. Master Node running traefik and swarmpit (Domains resolve and load the interfaces perfectly)
  2. Single worker node to run a php website. Currently I am testing the worker with the traefik/whoami container to confirm it's all working.
  3. ports open on both MASTER and WORKER: 80, 443, 2377, 4789 and 7946

I have confirmed that whoami is running on the Worker node via swarmpit.

PROBLEM: If I point who.mydomain.com to the Master IP, the whoami output loads perfectly. Presumably Traefik is routing the traffic through to the worker node and back. Is that correct?

HOWEVER: If I point who.mydomain.com to the Worker IP then I get nothing: ERR_CONNECTION_REFUSED.

According to the Docker Swarm docs, a domain can resolve to any node and the routing mesh will route accordingly. In this instance I simply want to point the domain at the worker node's IP and have it resolve.

I am stuck on this issue for a whoami service. Here is my compose file for the stack. It runs perfectly if I point the domain to the master IP, whether the service is deployed to the master or the worker, but not if I point the domain to the worker IP and deploy the service to the worker.

whoami.yml

version: '3.3'
services:
 new3:
    image: traefik/whoami
    environment:
      REAL_IP_FROM: 10.0.1.0/24
      REAL_IP_HEADER: '1'
    volumes:
     - /docker/data/www:/var/www/html
    networks:
     - net
     - traefik-public
    logging:
      driver: journald
    deploy:
      labels:
        traefik.http.routers.en3-https.rule: Host(`who.mydomain.com`)
        traefik.http.routers.en3-http.middlewares: https-redirect
        traefik.http.routers.en3-https.tls.certresolver: le
        traefik.constraint-label: traefik-public
        traefik.http.routers.en3-http.entrypoints: http
        traefik.http.services.en3.loadbalancer.server.port: '80'
        traefik.http.routers.en3-http.rule: Host(`who.mydomain.com`)
        traefik.http.routers.en3-https.entrypoints: https
        traefik.http.routers.en3-https.tls: 'true'
        traefik.docker.network: traefik-public
        traefik.enable: 'true'
      placement:
        constraints:
         - node.hostname == wwwprod
      resources:
        reservations:
          cpus: '0.5'
          memory: 512M
        limits:
          cpus: '1.0'
          memory: 1024M
networks:
  net:
    driver: overlay
  traefik-public:
    external: true

traefik.yml

version: '3.3'

services:

  traefik:
    # Use the latest Traefik image
    image: traefik:v2.2
    ports:
      # Listen on port 80, default for HTTP, necessary to redirect to HTTPS
      - target: 80
        published: 80
        mode: host
      # Listen on port 443, default for HTTPS
      - target: 443
        published: 443
        mode: host
    deploy:
      placement:
        constraints:
          # Make the traefik service run only on the node with this label
          # as the node with it has the volume for the certificates
          - node.labels.traefik-public == true
      labels:
        # Enable Traefik for this service, to make it available in the public network
        - traefik.enable=true
        # Use the traefik-public network (declared below)
        - traefik.docker.network=traefik-public
        # Use the custom label "traefik.constraint-label=traefik-public"
        # This public Traefik will only use services with this label
        # That way you can add other internal Traefik instances per stack if needed
        - traefik.constraint-label=traefik-public
        # admin-auth middleware with HTTP Basic auth
        # Using the environment variables USERNAME and HASHED_PASSWORD
        - traefik.http.middlewares.admin-auth.basicauth.users=stolenadmin:${HASHED_PASSWORD?Variable not set}
        # https-redirect middleware to redirect HTTP to HTTPS
        # It can be re-used by other stacks in other Docker Compose files
        - traefik.http.middlewares.https-redirect.redirectscheme.scheme=https
        - traefik.http.middlewares.https-redirect.redirectscheme.permanent=true
        # traefik-http set up only to use the middleware to redirect to https
        # Uses the environment variable DOMAIN
        - traefik.http.routers.traefik-public-http.rule=Host(`traefik.mydomain.com`)
        - traefik.http.routers.traefik-public-http.entrypoints=http
        - traefik.http.routers.traefik-public-http.middlewares=https-redirect
        # traefik-https the actual router using HTTPS
        # Uses the environment variable DOMAIN
        - traefik.http.routers.traefik-public-https.rule=Host(`traefik.mydomain.com`)
        - traefik.http.routers.traefik-public-https.entrypoints=https
        - traefik.http.routers.traefik-public-https.tls=true
        # Use the special Traefik service api@internal with the web UI/Dashboard
        - traefik.http.routers.traefik-public-https.service=api@internal
        # Use the "le" (Let's Encrypt) resolver created below
        - traefik.http.routers.traefik-public-https.tls.certresolver=le
        # Enable HTTP Basic auth, using the middleware created above
        - traefik.http.routers.traefik-public-https.middlewares=admin-auth
        # Define the port inside of the Docker service to use
        - traefik.http.services.traefik-public.loadbalancer.server.port=8080
    volumes:
      # Add Docker as a mounted volume, so that Traefik can read the labels of other services
      - /var/run/docker.sock:/var/run/docker.sock:ro
      # Mount the volume to store the certificates
      - /docker/data/traefik/certs:/certificates
    command:
      # Enable Docker in Traefik, so that it reads labels from Docker services
      - --providers.docker
      # Add a constraint to only use services with the label "traefik.constraint-label=traefik-public"
      - --providers.docker.constraints=Label(`traefik.constraint-label`, `traefik-public`)
      # Do not expose all Docker services, only the ones explicitly exposed
      - --providers.docker.exposedbydefault=false
      # Enable Docker Swarm mode
      - --providers.docker.swarmmode=true
      # Create an entrypoint "http" listening on address 80
      - --entrypoints.http.address=:80
      # Create an entrypoint "https" listening on address 443
      - --entrypoints.https.address=:443
      # Create the certificate resolver "le" for Let's Encrypt, uses the environment variable EMAIL
      - --certificatesresolvers.le.acme.email=contact@domain.com
      # Store the Let's Encrypt certificates in the mounted volume
      - --certificatesresolvers.le.acme.storage=/certificates/acme.json
      # Use the TLS Challenge for Let's Encrypt
      - --certificatesresolvers.le.acme.tlschallenge=true
      # Enable the access log, with HTTP requests
      - --accesslog
      # Enable the Traefik log, for configurations and errors
      - --log
      # Enable the Dashboard and API
      - --api
    networks:
      # Use the public network created to be shared between Traefik and
      # any other service that needs to be publicly available with HTTPS
      - traefik-public

networks:
  # Use the previously created public network "traefik-public", shared with other
  # services that need to be publicly available via this Traefik
  traefik-public:
    external: true

Thanks!!

What is the traefik service definition?

I updated the original post with those details.

As Traefik publishes its ports in host mode (mode: host), this is expected: you're not using the routing mesh of swarm, so ports 80 and 443 are only open on the node where Traefik itself runs.

I assume you're using host mode to get the client IPs; it is required for that. You can put a load balancer or GTM in front of your nodes. I do this with TraefikEE on swarm.
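
For comparison, here is a minimal sketch of the routing-mesh alternative, assuming you can live without real client IPs. Publishing Traefik's ports in ingress mode (the default when mode is omitted) makes swarm open 80 and 443 on every node and forward connections to the Traefik task. The trade-off is that Traefik then sees an ingress-network address instead of the client's. This is only a fragment of the traefik service, not a complete stack file:

```yaml
services:
  traefik:
    image: traefik:v2.2
    ports:
      # Ingress mode: the swarm routing mesh listens on every node,
      # so DNS may point at the master or the worker.
      - target: 80
        published: 80
        mode: ingress
      - target: 443
        published: 443
        mode: ingress
```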

That is a very astute analysis @cakiwi - thanks Sir!

You are 100% correct. We need REAL_IP_FROM to work.

As per your suggestions: GTM is not possible on our budget, so I will investigate a load balancer. I have previously used MetalLB with success but it's not compatible with Docker Swarm. Any suggestions for a thin, fast, Traefik-compatible LB?

Is that the only solution to getting REAL_IP and how would it pass through traffic to the containers? It seems that may be the next issue???

If you are on a cloud platform, most will have a basic loadbalancer suitable for the task.

I've had good experience with HAProxy before and would recommend it.

The client IPs will arrive in X-Forwarded-For headers (note: this example is NOT using host networking):

X-Forwarded-For: 172.23.0.1
X-Forwarded-Host: one.localhost
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Forwarded-Server: d60fa61fee25
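
If a load balancer sits in front of Traefik, Traefik only keeps an incoming X-Forwarded-For header when the sender is trusted. Below is a minimal sketch of the extra flags to add to the existing command list, assuming the load balancer's address is 10.0.0.10 (a hypothetical value) and reusing the http and https entrypoint names from the traefik.yml above:

```yaml
services:
  traefik:
    command:
      # Trust X-Forwarded-* headers only when they come from the load balancer.
      # 10.0.0.10/32 is a placeholder; substitute your load balancer's address.
      - --entrypoints.http.forwardedheaders.trustedips=10.0.0.10/32
      - --entrypoints.https.forwardedheaders.trustedips=10.0.0.10/32
```

If the load balancer passes TLS through in TCP mode it cannot add HTTP headers; Traefik entrypoints also have a proxyProtocol.trustedIPs option that may suit that case.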

Thanks very much @cakiwi .

I will first try this method: https://github.com/newsnowlabs/docker-ingress-routing-daemon (a Docker swarm daemon that modifies ingress mesh routing to expose true client IPs to service containers).

Then HAProxy. Thank-you!!!

Please report back on how it works.

Well it works perfectly. I'm just trying to get the script to run after reboot.

I will need some time tweaking this.

So here are the steps I took to get it working.

  1. Clone this onto each server:
https://github.com/newsnowlabs/docker-ingress-routing-daemon

  2. Find the swarm ingress network IP in use on each of the master and worker nodes. They usually start from 10.0.0.2, but do a docker inspect to confirm.

  3. Stop all stacks.

  4. Run this script with all your IPs, comma-separated. Don't forget to chmod +x the script first.

/docker/docker-ingress-routing-daemon --ingress-gateway-ips 10.0.0.2,10.0.0.3 --install

  5. Start all stacks and test with a whoami service; you should see the desired result (the real client IP).

  6. If you need to uninstall the changes, run:

/docker/docker-ingress-routing-daemon --uninstall

  7. Here is an uninstall service that runs before shutdown. The uninstall command goes in ExecStop, because an ExecStart command would run at boot instead:
[Unit]
Description=DIND Uninstall
# Started after Docker at boot, so it is stopped before Docker at shutdown
After=docker.service

[Service]
Type=oneshot
# Stay "active" after the no-op start so that ExecStop runs at shutdown
RemainAfterExit=yes
ExecStart=/bin/true
ExecStop=/bin/bash -c "echo 'DIND Uninstall service - start' && sleep 5 && /docker/docker-ingress-routing-daemon --uninstall && echo 'DIND Uninstall service - end'"

[Install]
WantedBy=multi-user.target

  8. Here is an install service that runs after Docker starts:
[Unit]
Description=DIND Install
After=docker.service
Requires=docker.service

[Service]
Type=simple
Restart=always
ExecStart=/bin/bash -c "/docker/docker-ingress-routing-daemon --ingress-gateway-ips 10.0.0.2,10.0.0.3 --install"

[Install]
WantedBy=multi-user.target

  9. Here are some helpful commands to get you around:
chmod 664 /etc/systemd/system/dind-uninstall.service
chmod 664 /etc/systemd/system/dind-install.service
chmod 664 /etc/systemd/system/dind*
systemctl daemon-reload && systemctl enable dind-uninstall && systemctl enable dind-install
systemctl disable dind-install
systemctl restart dind-install
vi /etc/systemd/system/dind-uninstall.service
vi /etc/systemd/system/dind-install.service

You may need to play around with the reboot start order. For example start the script before docker.service. Play around until you get it correct on your system.