Why does traffic coming in from Tailscale appear to be coming from the Docker network when using ipwhitelist?

Hey, I have Traefik v2.10.1 running in a Docker container in Swarm mode. I'm trying to set up an ipwhitelist middleware that only allows traffic from a specific IP range to reach certain services. I have an interface tailscale0 whose addresses fall in 100.64.0.0/10. With tcpdump I can see traffic from my Tailscale network arriving on tailscale0, but when Traefik filters it, the source always appears to be 172.18.0.1, which is my docker_gwbridge interface. I am also unable to use ipwhitelist.ipstrategy.depth=0, as it always produces an empty IP. I have forwardedHeaders.insecure set to true as well.

Why is that? I guess the whitelist check happens after Traefik has consumed the packets, but I still don't understand why their source IP would be the bridge gateway IP. Even so, why can't we see the original IP in the X-Forwarded-* headers? I'm more interested in exploring this issue and finding a root cause than in expediting a solution. I believe I could allow internal traffic by whitelisting the gateway, but I'm super curious to know what is happening here!

I have this all set up in a VM, and the system is reproducible with Ansible, so it's a very controlled environment.

version: "3.9"
services:
  traefik:
    image: "traefik:v2.10.1"
    networks:
      - traefik
    deploy:
      placement:
        constraints:
          - "node.role==manager"
      labels:
        traefik.enable: "true"
        # Global redirection: http to https
        traefik.http.routers.http-catchall.rule: HostRegexp(`{host:(www\.)?.+}`)
        traefik.http.routers.http-catchall.entrypoints: web
        traefik.http.routers.http-catchall.middlewares: wwwtohttps
        # Global redirection: https (www.) to https
        traefik.http.routers.wwwsecure-catchall.rule: HostRegexp(`{host:(www\.).+}`)
        traefik.http.routers.wwwsecure-catchall.entrypoints: websecure
        traefik.http.routers.wwwsecure-catchall.tls: "true"
        traefik.http.routers.wwwsecure-catchall.middlewares: wwwtohttps
        # middleware: http(s)://(www.) to  https://
        traefik.http.middlewares.wwwtohttps.redirectregex.regex: ^https?://(?:www\.)?(.+)
        traefik.http.middlewares.wwwtohttps.redirectregex.replacement: https://$${1}
        traefik.http.middlewares.wwwtohttps.redirectregex.permanent: "true"
        # middleware private
        traefik.http.middlewares.private.ipwhitelist.sourcerange: 100.64.0.0/10
        # traefik.http.middlewares.private.ipwhitelist.ipstrategy.depth: 1
        # Dashboard router
        traefik.http.routers.api.rule: Host(`traefik.[].com`)
        traefik.http.routers.api.service: api@internal
        traefik.http.routers.api.entryPoints: websecure
        traefik.http.services.api.loadbalancer.server.port: 8080
        traefik.http.routers.api.tls.certresolver: letsencrypt
        traefik.http.routers.api.middlewares: private
    command:
      - "--log.level=TRACE"
      # Allow dashboard
      - "--api.insecure=true"
      - "--api.dashboard=true"
      # Docker setup
      - "--providers.docker"
      - "--providers.docker.swarmMode"
      - "--providers.docker.network=traefik"
      - "--providers.docker.exposedbydefault=false"
      # Setup LetsEncrypt
      - "--certificatesresolvers.letsencrypt.acme.email=[]@gmail.com"
      - "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json"
      - "--certificatesresolvers.letsencrypt.acme.dnschallenge=true"
      - "--certificatesresolvers.letsencrypt.acme.dnschallenge.provider=route53"
      # Set up an insecure listener that redirects all traffic to TLS
      - "--entrypoints.web.address=:80"
      - "--entrypoints.web.forwardedHeaders.insecure=true"
      - "--entrypoints.web.http.redirections.entrypoint.to=websecure"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      - "--entrypoints.websecure.address=:443"
      # Set up the TLS configuration for our websecure listener
      - "--entrypoints.websecure.http.tls=true"
      - "--entrypoints.websecure.forwardedHeaders.insecure=true"
      - "--entrypoints.websecure.http.tls.certResolver=letsencrypt"
      - "--entrypoints.websecure.http.tls.domains[0].main=[].com"
      - "--entrypoints.websecure.http.tls.domains[0].sans=*.[].com"
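      # Buffer limits: this is a middleware definition (dynamic configuration), not a CLI flag,
      # so it has no effect under command and would need to be a label instead.
      # - "traefik.http.middlewares.limit.buffering.maxRequestBodyBytes=5000000"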
    environment:
      - AWS_CONFIG_FILE=/run/secrets/traefik_aws_credentials_file
      - AWS_SHARED_CREDENTIALS_FILE=/run/secrets/traefik_aws_credentials_file
      - AWS_REGION=us-west-2
      - AWS_HOSTED_ZONE_ID=/run/secrets/traefik_aws_hosted_zone_id
    secrets:
      - "traefik_aws_credentials_file"
      - "traefik_aws_hosted_zone_id"
    ports:
      - target: 443
        published: 443
        protocol: tcp
        mode: host
      - target: 80
        published: 80
        protocol: tcp
        mode: host
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "letsencrypt:/letsencrypt"

volumes:
  letsencrypt:

secrets:
  traefik_aws_credentials_file:
    external: true
  traefik_aws_hosted_zone_id:
    external: true

networks:
  traefik:
    external: true

Here is some debugging info. 100.78.172.82 is my laptop and 100.121.249.125 is the host with Traefik.

Traffic coming in from Tailscale:

root@debian:~# tcpdump -v -n -l -i tailscale0 |  grep 443
100.78.172.82.58131 > 100.121.249.125.443: Flags [S], cksum 0xbc93 (correct), seq 1374341268, win 65535, options [mss 1240,nop,wscale 6,nop,nop,TS val 2923519773 ecr 0,sackOK,eol], length 0
100.121.249.125.443 > 100.78.172.82.58131: Flags [S.], cksum 0x6ec6 (incorrect -> 0xb507), seq 1275678472, ack 1374341269, win 65160, options [mss 1460,sackOK,TS val 2104950929 ecr 2923519773,nop,wscale 7], length 0
100.78.172.82.58131 > 100.121.249.125.443: Flags [.], cksum 0xda55 (correct), ack 1, win 2053, options [nop,nop,TS val 2923519775 ecr 2104950929], length 0
100.78.172.82.58131 > 100.121.249.125.443: Flags [P.], cksum 0x7363 (correct), seq 1:993, ack 1, win 2053, options [nop,nop,TS val 2923519775 ecr 2104950929], length 992

Traffic hitting the docker_gwbridge interface:

root@debian:~# tcpdump -v -n -l -i docker_gwbridge |  grep 443
172.18.0.1.58179 > 172.18.0.3.443: Flags [P.], cksum 0x17f4 (correct), seq 260527606:260527645, ack 2374749685, win 2048, options [nop,nop,TS val 3381343070 ecr 2105039223], length 39
172.18.0.1.58179 > 172.18.0.3.443: Flags [P.], cksum 0x607a (correct), seq 39:63, ack 1, win 2048, options [nop,nop,TS val 3381343070 ecr 2105039223], length 24
172.18.0.1.58179 > 172.18.0.3.443: Flags [F.], cksum 0x9c7d (correct), seq 63, ack 1, win 2048, options [nop,nop,TS val 3381343070 ecr 2105039223], length 0
172.18.0.3.443 > 100.78.172.82.58179: Flags [.], cksum 0xbcdc (incorrect -> 0x1826), ack 260527670, win 501, options [nop,nop,TS val 2105048908 ecr 3381343070], length 0
172.18.0.3.443 > 100.78.172.82.58179: Flags [P.], cksum 0xbcf4 (incorrect -> 0xbcb1), seq 0:24, ack 1, win 501, options [nop,nop,TS val 2105048908 ecr 3381343070], length 24
172.18.0.3.443 > 100.78.172.82.58179: Flags [F.], cksum 0xbcdc (incorrect -> 0x180d), seq 24, ack 1, win 501, options [nop,nop,TS val 2105048908 ecr 3381343070], length 0

Logs from Traefik:

time="2024-08-24T02:45:56Z" level=debug msg="Rejecting IP 172.18.0.1: \"172.18.0.1\" matched none of the trusted IPs" middlewareName=private@docker middlewareType=IPWhiteLister
time="2024-08-24T02:45:57Z" level=debug msg="Rejecting IP 172.18.0.1: \"172.18.0.1\" matched none of the trusted IPs" middlewareType=IPWhiteLister middlewareName=private@docker
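
The rejection in these log lines can be reproduced with a plain CIDR membership test. The following is a minimal Python sketch, not Traefik's code; the helper name `matches_source_range` is made up for illustration, and the addresses are the ones quoted in this thread.

```python
import ipaddress


def matches_source_range(ip: str, source_ranges: list[str]) -> bool:
    """Return True if ip falls inside any of the given CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in source_ranges)


# Value of ipwhitelist.sourcerange in the compose file above.
SOURCE_RANGE = ["100.64.0.0/10"]

# Address Traefik reports for the connection (the docker_gwbridge gateway).
print(matches_source_range("172.18.0.1", SOURCE_RANGE))     # False
# The laptop's Tailscale address, as seen on tailscale0.
print(matches_source_range("100.78.172.82", SOURCE_RANGE))  # True
```

So the whitelist itself behaves as configured; the open question is why the connection reaches Traefik with the gateway address as its source.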

When I uncomment the depth line I get:

time="2024-08-24T02:44:56Z" level=debug msg="Rejecting IP : empty IP address" middlewareName=private@docker middlewareType=IPWhiteLister

Looks like I found a solution. By adding the following lines, I am now able to see the proper source ip of the traffic.

# ...
- "--entrypoints.web.proxyProtocol.trustedIPs=127.0.0.1/32,100.64.0.0/10"
- "--entrypoints.websecure.proxyProtocol.trustedIPs=127.0.0.1/32,100.64.0.0/10"
# ...

Anyone have any more info on this? How would ipwhitelist ever work properly without these lines?

The source address of a TCP connection is always that of the last device that handled it (a gateway, load balancer, or proxy).

To pass along the original IP, HTTP headers such as X-Forwarded-For are usually used. To keep clients from submitting fake IPs, you configure the IPs of the sources you trust to set those headers.
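
As a rough illustration of how a depth-based strategy reads that header: Traefik's documentation describes `ipstrategy.depth` as picking the address at the given position in X-Forwarded-For, counting from the right, and yielding an empty client IP when the depth exceeds the number of entries. The sketch below approximates that description and is not Traefik's implementation; `client_ip_at_depth` is a hypothetical helper, and it only models depths of 1 or more.

```python
def client_ip_at_depth(x_forwarded_for: str, depth: int) -> str:
    """Pick the IP at position `depth` from the right of an X-Forwarded-For value.

    Returns "" when the header holds fewer than `depth` addresses.
    """
    if depth < 1:
        raise ValueError("this sketch only models depth >= 1")
    ips = [part.strip() for part in x_forwarded_for.split(",") if part.strip()]
    if depth > len(ips):
        return ""
    return ips[-depth]


header = "10.0.0.1,11.0.0.1,12.0.0.1,13.0.0.1"
print(client_ip_at_depth(header, 1))  # 13.0.0.1
print(client_ip_at_depth(header, 3))  # 11.0.0.1
print(repr(client_ip_at_depth(header, 5)))  # ''
# A request that arrives with no X-Forwarded-For header at all:
print(repr(client_ip_at_depth("", 1)))  # ''
```

The last case would be consistent with the "empty IP address" log line above if the request reached the middleware without an X-Forwarded-For header, which is an assumption about this setup rather than something the thread confirms.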

For plain TCP connections (without HTTP), the proxyProtocol setting can be used. It needs to be enabled on both the sender and the receiver. The sender prefixes every TCP connection stream with the origin IP, and on the receiver you need to set the trusted sources.
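
For reference, version 1 of the PROXY protocol is a single text line sent before any application data, of the form `PROXY TCP4 <src ip> <dst ip> <src port> <dst port>` terminated by CRLF. Here is a minimal Python sketch of reading such a line; `parse_proxy_v1` and `ProxyHeader` are hypothetical names, this is not Traefik's parser, and it ignores the `UNKNOWN` family and the binary version 2 format. The sample addresses are taken from the tcpdump output earlier in the thread.

```python
from typing import NamedTuple


class ProxyHeader(NamedTuple):
    family: str
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def parse_proxy_v1(stream: bytes) -> tuple[ProxyHeader, bytes]:
    """Split a PROXY protocol v1 line off the front of a TCP stream.

    Returns the parsed header and the remaining application bytes.
    """
    line, sep, rest = stream.partition(b"\r\n")
    if not sep:
        raise ValueError("no CRLF-terminated PROXY line found")
    fields = line.decode("ascii").split(" ")
    if len(fields) != 6 or fields[0] != "PROXY" or fields[1] not in ("TCP4", "TCP6"):
        raise ValueError("not a PROXY v1 TCP4/TCP6 header")
    _, family, src_ip, dst_ip, src_port, dst_port = fields
    return ProxyHeader(family, src_ip, dst_ip, int(src_port), int(dst_port)), rest


# What a PROXY-protocol-aware sender would put in front of an HTTP request.
stream = b"PROXY TCP4 100.78.172.82 100.121.249.125 58131 443\r\nGET / HTTP/1.1\r\n\r\n"
header, payload = parse_proxy_v1(stream)
print(header.src_ip)  # the original client address carried by the sender
print(payload)        # the untouched application data that follows
```

A client that does not send this line (curl or a browser, by default) gives the receiver nothing to read, which is why the setting has to be enabled on both ends.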

I'm actually not sure about the results I got from the "solution" I posted above... it consistently works only on a clean installation, BEFORE restarting the VM.

Here is some progress with debugging. I am now running just the following: a debug container that echoes HTTP requests. I think this concisely represents the issue I am having.

services:
  echo:
    image: "mendhak/http-https-echo:31"
    networks:
      - traefik
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.echo.rule=Host(`echo.mydomain.com`)"
      - "traefik.http.routers.echo.entrypoints=websecure"
      - "traefik.http.routers.echo.tls.certresolver=letsencrypt"
      - "traefik.http.services.echo.loadbalancer.server.port=8080"
    ports:
      - target: 8080
        published: 8080
        protocol: tcp
        mode: host

So when I hit the server from my laptop via its external IP, I get the following:

macbook-air$ curl http://[SERVER-IP]:8080/
...
  "ip": "::ffff:[MY-RESIDENTIAL-IP]",
...

And when I hit it via Tailscale:

macbook-air$ curl http://100.118.183.63:8080/
...
  "ip": "::ffff:172.18.0.1",
...

So when I use the external IP (traffic coming in on eth0), the HTTP request still has my original source IP by the time it reaches the service. Why does hitting it via Tailscale (the tailscale0 interface) suddenly produce a request whose source IP is the Docker bridge gateway? It produces the same result as hitting the server via localhost.

Here is what is really confusing to me. Here is the packet trace of the latter request going through TS:

server$ tcpdump -nn -i any 'not port 22 and not arp'
...
00:07:15.663367 tailscale0 In  IP 100.78.172.82.52285 > 100.65.43.15.8080: Flags [SEW], seq 1640425975, win 65535, options [mss 1240,nop,wscale 6,nop,nop,TS val 2562139288 ecr 0,sackOK,eol], length 0
00:07:15.663400 docker_gwbridge Out IP 172.18.0.1.52285 > 172.18.0.3.8080: Flags [SEW], seq 1640425975, win 65535, options [mss 1240,nop,wscale 6,nop,nop,TS val 2562139288 ecr 0,sackOK,eol], length 0

How does the traffic suddenly get usurped by docker_gwbridge?
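
To make the comparison concrete, the two trace lines can be parsed and diffed field by field. This is a small Python sketch written only for these two lines (abbreviated after the window size); the regular expression and the helper names `parse` and `changed_fields` are made up for illustration and are not a general tcpdump parser.

```python
import re

# Matches the "-i any" output format shown above: interface, direction, addresses, flags, seq.
LINE_RE = re.compile(
    r"(?P<iface>\S+)\s+(?P<dir>In|Out)\s+IP\s+"
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): "
    r"Flags \[(?P<flags>[^\]]+)\], seq (?P<seq>\d+)"
)


def parse(line: str) -> dict[str, str]:
    """Extract interface, direction, endpoints, flags and seq from one trace line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized trace line: {line!r}")
    return match.groupdict()


def changed_fields(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return only the fields whose values differ between two parsed lines."""
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


incoming = parse(
    "00:07:15.663367 tailscale0 In  IP 100.78.172.82.52285 > 100.65.43.15.8080: "
    "Flags [SEW], seq 1640425975, win 65535"
)
outgoing = parse(
    "00:07:15.663400 docker_gwbridge Out IP 172.18.0.1.52285 > 172.18.0.3.8080: "
    "Flags [SEW], seq 1640425975, win 65535"
)

print(sorted(changed_fields(incoming, outgoing)))  # ['dir', 'dst', 'iface', 'src']
```

The ports, flags, and sequence number are identical, so this looks like the same SYN with both addresses rewritten on the host. The destination rewrite to 172.18.0.3 is what publishing the port would be expected to do; the source rewrite to 172.18.0.1 is the puzzling part, and nothing in this thread establishes which rule performs it.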

For Docker networking questions, you can try https://forums.docker.com/

Good call, this has gotten off topic. Will do.