WSS connection cannot be established

We are running FleetDM (v4.84.2) behind Traefik (v3.7.0) in a clustered Docker Swarm environment.
The FleetDM UI and standard HTTP API requests load successfully over HTTPS on the standard websecure entrypoint (:443). However, Live Queries fail immediately: the persistent WebSocket (wss://) handshake never completes cleanly, so the connection either drops at once or gets stuck in a reconnection loop.

  • Interesting Finding: When isolating an independent entrypoint explicitly for Fleet (e.g., fleetport on :8085) without generic web middlewares or stream limits, the WebSocket connection for Live Queries works perfectly. The failure occurs only when routing through the shared production websecure entrypoint on port 443.
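
The labels used for that isolated test were not included in the post. A minimal sketch of what such a router could look like, assuming a hypothetical router name `fleet-direct`, reusing the `fleet-service` service from the Fleet stack, and assuming TLS stays enabled on the extra entrypoint:

```yaml
# Hypothetical sketch: a second router for Fleet bound only to the
# dedicated "fleetport" entrypoint (:8085), with no middlewares attached.
# The router name "fleet-direct" is an assumption, not taken from the real stack.
services:
  fleet:
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.fleet-direct.rule=Host(`<host>`)"
        - "traefik.http.routers.fleet-direct.entrypoints=fleetport"
        - "traefik.http.routers.fleet-direct.tls=true"  # assumption: TLS kept on
        - "traefik.http.routers.fleet-direct.service=fleet-service"
        - "traefik.http.services.fleet-service.loadbalancer.server.port=9010"
```

Comparing this router with the one on websecure narrows the difference down to the entrypoint itself, since the rule, service, and backend port are identical.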

2. Environment Architecture & Stack Details

  • Orchestrator: Docker Swarm (Global mode deployment for Traefik on managers)

  • Ingress/Proxy Engine: Traefik v3.7.0

  • Application Server: FleetDM v4.84.2 (using Redis 7.2.4 state store)

  • Host Interface Configuration: Traefik ports are published with mode: host to bypass the Swarm ingress routing mesh where possible.

  • Network Drivers: Overlay network (traefik-net) configured as external/attachable.

Share your full Traefik static and dynamic config.

I didn't attach the dynamic file config for my Fleet service, but these are the compose files for my Traefik and Fleet stacks.

Fleet Service YML:

```
version: "3.8"

networks:
  fleet-net:
    driver: overlay
    attachable: true
  traefik-net:
    external: true
    name: traefik-net

secrets:
  fleet-database-password:
    external: true

x-default-opts:
  &default-opts
  deploy:
    mode: replicated
    replicas: 1
    placement:
      max_replicas_per_node: 1
    restart_policy:
      condition: on-failure
      max_attempts: 3
    resources:
      limits:
        memory: 256M
      reservations:
        memory: 50M

services:
  redis:
    <<: *default-opts
    image: psbnexus:5002/redis:7.2.4-alpine3.19
    networks:
      - fleet-net
    deploy:
      resources:
        limits:
          memory: 100M
        reservations:
          memory: 50M

  fleet:
    <<: *default-opts
    image: fleetdm/fleet:v4.84.2
    command: sh -c "/usr/bin/fleet prepare db --no-prompt && /usr/bin/fleet serve"
    ports:
      - target: 9010
        published: 9010
        mode: host
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.docker.network=traefik-net"
        - "traefik.http.routers.fleet.rule=Host(`<host>`)"
        - "traefik.http.routers.fleet.entrypoints=websecure"
        - "traefik.http.routers.fleet.tls=true"
        - "traefik.http.routers.fleet.service=fleet-service"
        - "traefik.http.services.fleet-service.loadbalancer.server.port=9010"
    networks:
      - fleet-net
      - traefik-net
    secrets:
      - fleet-database-password
    environment:
      FLEET_MYSQL_ADDRESS: <>:3306
      FLEET_MYSQL_DATABASE: osquery
      FLEET_MYSQL_USERNAME: jarvis
      FLEET_MYSQL_PASSWORD_PATH: /run/secrets/fleet-database-password
      FLEET_REDIS_ADDRESS: redis:6379
      FLEET_SERVER_ADDRESS: '0.0.0.0:9010'
      FLEET_SERVER_TLS: 'false'
      FLEET_SERVER_PUBLIC_URL: 'https://<hostname>'
      FLEET_LIVE_QUERY_URL: 'http://<hostname>'
      FLEET_CORS_ALLOWED_ORIGINS: 'https://<hostname>,http://<hostname>'
      FLEET_LOGGING_JSON: "true"
      FLEET_ACTIVITY_ENABLE_AUDIT_LOG: "true"
      FLEET_FILESYSTEM_ENABLE_LOG_ROTATION: "true"
      FLEET_LOGGING_DEBUG: "true"
      FLEET_VULNERABILITIES_CURRENT_INSTANCE_CHECKS: "yes"
      FLEET_VULNERABILITIES_DISABLE_SCHEDULE: "true"
      FLEET_SESSION_DURATION: 15m
      FLEET_DISABLE_LOGIN_FORM_AUTOCOMPLETE: 1
    volumes:
      []
```

Traefik Service YML:

```
version: "3.8"

networks:
  default:
    name: traefik-net
    driver: overlay
    attachable: true

services:
  traefik:
    image: traefik:3.7.0  
    container_name: traefik
    deploy:
      mode: global
      placement:
        constraints:
          - "node.role==manager"
      restart_policy:
        condition: on-failure
        max_attempts: 3
      labels:
        - "traefik.enable=true"
        - "traefik.http.middlewares.sslheader.headers.customrequestheaders.X-Forwarded-Proto=https"
        - "traefik.http.middlewares.sslheader.headers.isdevelopment=false"
        - "traefik.http.routers.traefik.middlewares=sslheader"
        - "traefik.http.routers.traefik.rule=Host(`<host>`)"
        - "traefik.http.routers.traefik.entrypoints=websecure"
        - "traefik.http.routers.traefik.tls=true"
        - "traefik.http.routers.traefik.service=api@internal"
        - "traefik.http.services.traefik.loadbalancer.server.port=8080"
    command:
      - "--global.checknewversion=false"
      - "--global.sendanonymoususage=false"
      - "--accesslog=true"
      - "--log.level=DEBUG"
      - "--log.format=json"
      - "--api=true"
      - "--api.dashboard=true"
      - "--api.debug=false"
      - "--api.insecure=true"
      - "--serverstransport.insecureskipverify=true"
      - "--providers.http=false"
      - "--providers.docker=false" 
      - "--providers.file.filename=/etc/traefik/dynamic.yml"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.metrics.address=:8082"
      - "--entrypoints.fleetport.address=:8085"
      - "--metrics.prometheus.entryPoint=metrics"
      - "--metrics.prometheus=true"
      - "--metrics.prometheus.addEntryPointsLabels=true"
      - "--metrics.prometheus.addServicesLabels=true"
      - "--metrics.prometheus.manualRouting=true"
      - "--entrypoints.websecure.http2.maxconcurrentstreams=0"
      - "--entrypoints.websecure.transport.respondingtimeouts.readtimeout=3600s"
      - "--entrypoints.websecure.transport.respondingtimeouts.writetimeout=3600s"
      - "--entrypoints.websecure.transport.respondingtimeouts.idletimeout=3600s"
      - "--entrypoints.fleetport.transport.respondingtimeouts.readtimeout=3600s"
      - "--entrypoints.fleetport.transport.respondingtimeouts.writetimeout=3600s"
      - "--entrypoints.fleetport.transport.respondingtimeouts.idletimeout=3600s"
      - "--serverstransport.forwardingtimeouts.dialtimeout=30s"
      - "--serverstransport.forwardingtimeouts.responseheadertimeout=0s"
      - "--serverstransport.forwardingtimeouts.idleconntimeout=90s"
      - "--providers.swarm=true"
      - "--providers.swarm.endpoint=unix:///var/run/docker.sock"
      - "--providers.swarm.network=traefik-net"
      - "--providers.swarm.exposedbydefault=false"
    ports:
      - target: 80
        published: 80
        protocol: tcp
        mode: host
      - target: 443
        published: 443
        protocol: tcp
        mode: host
      - target: 8085
        published: 8085
        protocol: tcp
        mode: host
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "/data/certs:/etc/certs:ro"
      - "/data/traefik/traefik.yml:/etc/traefik/dynamic.yml"
```

Use 3 backticks before and after code/config in posts to make it more readable and preserve spacing, which is important in yaml.

Thanks. Do you have any idea about this issue? I have already revised the formatting.

Maybe the mistake is here:

      FLEET_SERVER_PUBLIC_URL: 'https://fleet-staging.psbank.com.ph'
      FLEET_LIVE_QUERY_URL: 'http://fleet-stagingpsbank.com.ph'

I've already tried removing that too, and it is still not working.

Did you see there is a dot missing in the second domain name?
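
For comparison, this is how the two values would read with the missing dot restored. It is only a sketch: the hostname is taken from the lines quoted above, and the `http` scheme on the second value is kept as posted, since whether it should also be `https` behind Traefik has not been settled in this thread.

```yaml
# Sketch only: the same two variables as quoted above, with the dot restored
# between "fleet-staging" and "psbank" in the second hostname.
environment:
  FLEET_SERVER_PUBLIC_URL: 'https://fleet-staging.psbank.com.ph'
  # Scheme left as posted; switching it to https is an untested assumption.
  FLEET_LIVE_QUERY_URL: 'http://fleet-staging.psbank.com.ph'
```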