Frequent Timeout Issues with Traefik Reverse Proxy

Hi everyone,

Quick background:

I have a virtual server running Ubuntu Server 24.04.1 LTS (IP: 172.21.21.22).

I also have a second virtual server running Windows Server 2019 on a different subnet (IP: 10.10.180.41).

These two servers can communicate with each other.

On the Ubuntu server, I am running Docker and Traefik using the following configuration files:

docker-compose.yaml

---
services:
  traefik:
    image: traefik:latest
    container_name: traefik
    command: --api.insecure=true --providers.docker --providers.docker.network=frontend
    ports:
      - "80:80"
      - "443:443"
      - "8088:8088"
    environment:
      - CF_DNS_API_TOKEN=${CF_DNS_API_TOKEN}
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./config/traefik.yaml:/etc/traefik/traefik.yaml:ro
      - ./data/certs/:/var/traefik/certs:rw
      - ./dynamic_conf:/dynamic_conf:ro
    networks:
      - frontend
    restart: unless-stopped
networks:
  frontend:
    external: true
    name: frontend

services.yaml

http:
  routers:
    traefik-dashboard-http:
      entryPoints:
        - web
      rule: "Host(`traefik-dashboard.mydomain.com`)"
      service: api@internal
      middlewares:
        - local-network-whitelist
        - traefik-dashboard-auth
    traefik-dashboard-https:
      entryPoints:
        - websecure
      rule: "Host(`traefik-dashboard.mydomain.com`)"
      service: api@internal
      middlewares:
        - local-network-whitelist
        - traefik-dashboard-auth
      tls:
        certResolver: cloudflare

    logs-http:
      entryPoints:
        - web
      rule: "Host(`logs.mydomain.com`)"
      service: logs-service

    logs-https:
      entryPoints:
        - websecure
      rule: "Host(`logs.mydomain.com`)"
      service: logs-service
      tls:
        certResolver: cloudflare

    test-http:
      entryPoints:
        - web
      rule: "Host(`test.mydomain.com`)"
      service: test-service

    test-https:
      entryPoints:
        - websecure
      rule: "Host(`test.mydomain.com`)"
      service: test-service
      tls:
        certResolver: cloudflare

  services:
    logs-service:
      loadBalancer:
        servers:
          - url: "http://10.10.180.41:80"
        sticky:
          cookie:
            name: "traefik-session-logs"

    test-service:
      loadBalancer:
        servers:
          - url: "http://10.10.180.41:80"
        serversTransport: "myTransport"
        sticky:
          cookie:
            name: "traefik-session-test"

  serversTransports:
    myTransport:
      forwardingTimeouts:
        responseHeaderTimeout: "300s"
        dialTimeout: "60s"
        idleConnTimeout: "300s"

middlewares.yaml

http:
  middlewares:
    traefik-dashboard-auth:
      basicAuth:
        usersFile: "/dynamic_conf/auth.passwd"
    local-network-whitelist:
      ipAllowList:
        sourceRange:
          - "172.21.21.0/24"
          - "172.87.21.0/24"
          - "10.10.180.0/24"

traefik.yaml

global:
  checkNewVersion: false
  sendAnonymousUsage: false

log:
  level: DEBUG
  maxSize: 100
  maxBackups: 10
  maxAge: 10
  
api: {}

entryPoints:
  web:
    address: :80
  websecure:
    address: :443

certificatesResolvers:
  cloudflare:
    acme:
      email: mycloudflaremail@mydomain.com
      storage: /var/traefik/certs/cloudflare-acme.json
      caServer: "https://acme-v02.api.letsencrypt.org/directory"
      keyType: EC256
      dnsChallenge:
        provider: cloudflare
        resolvers:
          - "1.1.1.1:53"
          - "8.8.8.8:53"

providers:
  docker:
    endpoint: "unix:///var/run/docker.sock"
    exposedByDefault: false
  file: 
    directory: /dynamic_conf
    watch: true

accessLog:
  filePath: "/var/log/traefik/access.log"
  format: clf
  bufferingSize: 100
  fields:
    defaultMode: keep
    headers:
      defaultMode: keep
  
metrics:
  prometheus:
    addEntryPointsLabels: true
    addServicesLabels: true
    buckets:
      - 0.1
      - 0.3
      - 1.2
      - 5.0

The problem:

When browsing the website hosted on the Windows Server (EpiServer/Optimizely) through Traefik, I frequently get 504 Gateway Timeout responses, especially under heavier load such as publishing a page in Optimizely.

I've tried various solutions but can't seem to resolve the issue. Do you have any tips or suggestions on how I can ensure my site hosted on the Windows Server works smoothly through Traefik? Other Docker services controlled by Traefik are working fine.

Errors from the Traefik log:

2025-01-20T12:34:11Z DBG github.com/traefik/traefik/v3/pkg/proxy/httputil/proxy.go:113 > 504 Gateway Timeout error="dial tcp 10.10.180.41:80: i/o timeout"
2025-01-20T12:36:03Z DBG github.com/traefik/traefik/v3/pkg/proxy/httputil/proxy.go:113 > 504 Gateway Timeout error="net/http: timeout awaiting response headers"
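
As far as I can tell, these two lines are different failures. `dial tcp ... i/o timeout` is what Go reports when the TCP connection to the backend is not established within the dial timeout. `net/http: timeout awaiting response headers` is what Go's HTTP transport reports when the connection is open but the backend has not sent response headers in time. In Traefik these seem to correspond to `dialTimeout` and `responseHeaderTimeout` under `forwardingTimeouts`. In the services.yaml above only test-service references myTransport, so logs-service runs with the default transport settings. A minimal sketch of pointing both services at the same transport follows. The timeout values are simply the ones already in my config, not recommendations, and the sticky cookie settings are left out for brevity:

```yaml
# Sketch only: same layout as services.yaml, with both services
# referencing the shared transport. Values are placeholders.
http:
  services:
    logs-service:
      loadBalancer:
        servers:
          - url: "http://10.10.180.41:80"
        serversTransport: "myTransport"
    test-service:
      loadBalancer:
        servers:
          - url: "http://10.10.180.41:80"
        serversTransport: "myTransport"

  serversTransports:
    myTransport:
      forwardingTimeouts:
        dialTimeout: "60s"            # limit for establishing the TCP connection
        responseHeaderTimeout: "300s" # limit for waiting on response headers
        idleConnTimeout: "300s"       # how long idle keep-alive connections are kept
```

A longer `dialTimeout` would probably not help with the first error, since a dial timeout suggests the connection attempt is not getting through at all. That line may be worth checking on the network or firewall side instead.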

On the Windows Server itself I don't see any error messages, and I can browse the site without issues at http://localhost directly on that server, even while Traefik is timing out. So the problem appears to be either in Traefik or on the network path between the two servers, not in the site itself.
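
Browsing http://localhost on the Windows Server does not exercise the path from the Traefik container to 10.10.180.41, so one way to narrow this down might be to request the backend from a throwaway container on the same `frontend` network. A minimal sketch, assuming the public `curlimages/curl` image can be pulled. The file name `probe.yaml` and the service name `backend-probe` are made up for illustration, and the timeouts are arbitrary:

```yaml
# probe.yaml (hypothetical file name): one-off curl request from the
# same Docker network that Traefik uses, printing timing information.
services:
  backend-probe:
    image: curlimages/curl:latest
    networks:
      - frontend
    command:
      - "--silent"
      - "--show-error"
      - "--output"
      - "/dev/null"
      - "--connect-timeout"   # give up if the TCP connection takes longer than 10s
      - "10"
      - "--max-time"          # give up on the whole request after 120s
      - "120"
      - "--header"
      - "Host: test.mydomain.com"
      - "--write-out"
      - 'connect=%{time_connect}s first_byte=%{time_starttransfer}s total=%{time_total}s\n'
      - "http://10.10.180.41:80/"
networks:
  frontend:
    external: true
    name: frontend
```

This could be run with `docker compose -f probe.yaml run --rm backend-probe`, ideally several times while a page is being published in Optimizely. If the connect time is occasionally very high or the connection times out, that would point at the network between the servers. If the connection is fast but the first byte is slow, that would point at the backend taking a long time to respond.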