Random CORS/504 Gateway Timeout Issues with Docker Compose Setup

We're experiencing intermittent CORS and 504 Gateway Timeout errors with our Docker Compose setup. The issue appears randomly: sometimes the admin API works while the regular API fails, and vice versa. This is a simple rented VPC with port 80 exposed and a Cloudflare-proxied DNS record in front of it.

TL;DR: I have two identical services that serve different users, an admin API and a regular API, both built from the same codebase. One fails while the other works as intended. If I restart Traefik several times, the other one starts working and the first one stops. The results seem random (one works or the other, never both), and I cannot figure out why this is happening.

Environment

  • Docker Compose with Traefik v3.4 reverse proxy
  • Two API services: api and admin-api (same Docker image, different configurations)
  • Frontend services: admin and customer
  • PostgreSQL database
  • Cloudflare proxy in front of services

Configuration Details

Docker Compose Services

services:
  traefik:
    image: traefik:v3.4
    command:
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --entrypoints.web.address=:80
      - --api.insecure=true
      - --log.level=INFO
      - --accesslog=true
    ports:
      - "80:80"
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - proxy

  api:
    image: registry.example.com/geodesia/api:${FROM_ENV}
    environment:
      NODE_ENV: production
      ADMIN_MODE: "false"
      BASE_URL: https://api.example.com
      ALLOWED_ORIGINS: https://app.example.com
    networks:
      - backend
      - proxy
    labels:
      - traefik.enable=true
      - traefik.http.routers.api.rule=Host(`api.example.com`)
      - traefik.http.routers.api.entrypoints=web
      - traefik.http.services.api.loadbalancer.server.port=${API_PORT:-80}
      - traefik.http.routers.api.middlewares=cors-headers
      - traefik.http.middlewares.cors-headers.headers.accesscontrolallowcredentials=true
      - traefik.http.middlewares.cors-headers.headers.accesscontrolallowheaders=Content-Type,Authorization
      - traefik.http.middlewares.cors-headers.headers.accesscontrolallowmethods=GET,POST,PUT,DELETE,OPTIONS
      - traefik.http.middlewares.cors-headers.headers.accesscontrolalloworiginlist=https://app.example.com,https://admin.example.com

  admin-api:
    image: registry.example.com/geodesia/api:${FROM_ENV}
    environment:
      NODE_ENV: production
      ADMIN_MODE: "true"
      BASE_URL: https://admin-api.example.com
      ALLOWED_ORIGINS: https://admin.example.com
    networks:
      - backend
      - proxy
    labels:
      - traefik.enable=true
      - traefik.http.routers.admin-api.rule=Host(`admin-api.example.com`)
      - traefik.http.routers.admin-api.entrypoints=web
      - traefik.http.services.admin-api.loadbalancer.server.port=${API_PORT:-80}
      - traefik.http.routers.admin-api.middlewares=cors-headers
      - traefik.http.middlewares.cors-headers.headers.accesscontrolallowcredentials=true
      - traefik.http.middlewares.cors-headers.headers.accesscontrolallowheaders=Content-Type,Authorization
      - traefik.http.middlewares.cors-headers.headers.accesscontrolallowmethods=GET,POST,PUT,DELETE,OPTIONS
      - traefik.http.middlewares.cors-headers.headers.accesscontrolalloworiginlist=https://app.example.com,https://admin.example.com

Symptom Examples

Failing Request (Regular API)

Request URL: https://api.example.com/api/auth/get-session
Request Method: GET
Status Code: 504 Gateway Timeout
Origin: https://app.example.com

Successful Request (Admin API)

Request URL: https://admin-api.example.com/api/auth/get-session
Request Method: GET
Status Code: 200 OK
Origin: https://admin.example.com
Response Headers:
access-control-allow-credentials: true
access-control-allow-origin: https://admin.example.com

Investigation Steps Taken

  1. Container Status: Both API containers are running and healthy

    docker ps | grep api
    # Shows both api-1 and admin-api-1 running
    
  2. Container Logs: Both services show successful startup

    • Both containers show "Starting server on port 80"
    • Database connectivity established
    • S3 connectivity verified
    • No error messages in logs
  3. Network Configuration: Both containers connected to required networks

    docker network ls | grep -E "(proxy|backend)"
    e19c0cd203be   geodesia_backend   bridge    local
    dda7c7cf01a1   geodesia_proxy     bridge    local
    
  4. Container Inspection: Traefik labels properly configured for both services

    Regular API Container Labels:

    "Labels": {
        "com.docker.compose.config-hash": "359e22fdb80b58be78cf625833a2b935f61daa65b489dbd2ca64af8929dcc802",
        "com.docker.compose.service": "api",
        "traefik.enable": "true",
        "traefik.http.middlewares.cors-headers.headers.accesscontrolallowcredentials": "true",
        "traefik.http.middlewares.cors-headers.headers.accesscontrolallowheaders": "Content-Type,Authorization",
        "traefik.http.middlewares.cors-headers.headers.accesscontrolallowmethods": "GET,POST,PUT,DELETE,OPTIONS",
        "traefik.http.middlewares.cors-headers.headers.accesscontrolalloworiginlist": "https://app.example.com,https://admin.example.com",
        "traefik.http.routers.api.entrypoints": "web",
        "traefik.http.routers.api.middlewares": "cors-headers",
        "traefik.http.routers.api.rule": "Host(`api.example.com`)",
        "traefik.http.services.api.loadbalancer.server.port": "80"
    }
    

    Admin API Container Labels:

    "Labels": {
        "com.docker.compose.config-hash": "8723a3e93ccd767322dd49d6ed7e2c0906c931bf9d97e813fdf7721242ec27ad",
        "com.docker.compose.service": "admin-api",
        "traefik.enable": "true",
        "traefik.http.middlewares.cors-headers.headers.accesscontrolallowcredentials": "true",
        "traefik.http.middlewares.cors-headers.headers.accesscontrolallowheaders": "Content-Type,Authorization",
        "traefik.http.middlewares.cors-headers.headers.accesscontrolallowmethods": "GET,POST,PUT,DELETE,OPTIONS",
        "traefik.http.middlewares.cors-headers.headers.accesscontrolalloworiginlist": "https://app.example.com,https://admin.example.com",
        "traefik.http.routers.admin-api.entrypoints": "web",
        "traefik.http.routers.admin-api.middlewares": "cors-headers",
        "traefik.http.routers.admin-api.rule": "Host(`admin-api.example.com`)",
        "traefik.http.services.admin-api.loadbalancer.server.port": "80"
    }
    

    Network Configuration:

    "NetworkSettings": {
        "Networks": {
            "geodesia_backend": {
                "IPAddress": "172.19.0.3",
                "Aliases": ["geodesia-api-1", "api"]
            },
            "geodesia_proxy": {
                "IPAddress": "172.18.0.3",
                "Aliases": ["geodesia-api-1", "api"]
            }
        }
    }
    
  5. Direct Container Access: Services respond internally but not externally

    • Regular API: Times out when accessed via curl
    • Admin API: Responds successfully when accessed via curl

Temporary Workaround

Restarting the Docker services temporarily resolves the issue for one of the services, but I can't get both to work at the same time:

docker compose restart

Additional Context

  • The issue appears to be completely random - no specific triggers identified
  • Both services use the same Docker image with different environment variables
  • The problem affects both CORS headers and basic connectivity
  • Manual service restarts temporarily resolve the issue for one of the services, but don't prevent recurrence

Three recommendations:

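  1. Upgrade to the latest Traefik release, v3.5.
  2. Set providers.docker.network so Traefik knows which Docker network to use. Your API containers are attached to both backend and proxy, while Traefik is only on proxy.
  3. Do not define a middleware with the same name more than once. cors-headers is currently declared on both api and admin-api.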
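
A rough sketch of how these recommendations might look in the compose file from the question. This is untested against your setup and rests on a few assumptions: geodesia_proxy is taken from your docker network ls output, the middleware names api-cors and admin-api-cors are made up for illustration, and unchanged keys (environment, the top-level networks section, the dashboard and logging flags) are left out for brevity. When a container sits on several networks and no network is specified, Traefik may pick an address on a network it cannot reach, which would be consistent with the intermittent 504s.

```yaml
services:
  traefik:
    image: traefik:v3.5
    command:
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      # Use the network Traefik itself is attached to when resolving container IPs.
      # geodesia_proxy is the full network name reported by `docker network ls`.
      - --providers.docker.network=geodesia_proxy
      - --entrypoints.web.address=:80
    ports:
      - "80:80"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - proxy

  api:
    image: registry.example.com/geodesia/api:${FROM_ENV}
    networks:
      - backend
      - proxy
    labels:
      - traefik.enable=true
      - traefik.http.routers.api.rule=Host(`api.example.com`)
      - traefik.http.routers.api.entrypoints=web
      - traefik.http.services.api.loadbalancer.server.port=${API_PORT:-80}
      # api-cors (hypothetical name) is declared on this container only.
      - traefik.http.routers.api.middlewares=api-cors
      - traefik.http.middlewares.api-cors.headers.accesscontrolallowcredentials=true
      - traefik.http.middlewares.api-cors.headers.accesscontrolallowheaders=Content-Type,Authorization
      - traefik.http.middlewares.api-cors.headers.accesscontrolallowmethods=GET,POST,PUT,DELETE,OPTIONS
      - traefik.http.middlewares.api-cors.headers.accesscontrolalloworiginlist=https://app.example.com,https://admin.example.com

  admin-api:
    image: registry.example.com/geodesia/api:${FROM_ENV}
    networks:
      - backend
      - proxy
    labels:
      - traefik.enable=true
      - traefik.http.routers.admin-api.rule=Host(`admin-api.example.com`)
      - traefik.http.routers.admin-api.entrypoints=web
      - traefik.http.services.admin-api.loadbalancer.server.port=${API_PORT:-80}
      # admin-api-cors (hypothetical name) is declared on this container only.
      - traefik.http.routers.admin-api.middlewares=admin-api-cors
      - traefik.http.middlewares.admin-api-cors.headers.accesscontrolallowcredentials=true
      - traefik.http.middlewares.admin-api-cors.headers.accesscontrolallowheaders=Content-Type,Authorization
      - traefik.http.middlewares.admin-api-cors.headers.accesscontrolallowmethods=GET,POST,PUT,DELETE,OPTIONS
      - traefik.http.middlewares.admin-api-cors.headers.accesscontrolalloworiginlist=https://app.example.com,https://admin.example.com
```

If you would rather not set the network globally, the per-container label traefik.docker.network=geodesia_proxy on api and admin-api should have the same effect. Separate middleware names also let you narrow each origin list to the one frontend that actually calls that API, if that suits your setup.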