We are running FleetDM (v4.84.2) behind Traefik (v3.7.0) in a clustered Docker Swarm environment.
The FleetDM UI and standard HTTP API requests load successfully over HTTPS on the standard websecure entrypoint (:443). However, Live Queries fail immediately. The persistent WebSocket (wss://) handshake never completes cleanly, so the connection either drops at once or enters a continuous reconnect loop.
Interesting Finding: When Fleet is routed through its own dedicated entrypoint (e.g., fleetport on :8085) without generic web middlewares or stream limits, the WebSocket connection for Live Queries works perfectly. The failure occurs only when routing through the shared production websecure entrypoint on port 443.
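To compare the two entrypoints outside the browser, a raw HTTP/1.1 upgrade request can be sent to each one and the status lines compared. The following is only a minimal sketch: the function names are made up for illustration, it assumes the entrypoint being probed terminates TLS with a certificate the client trusts, and the WebSocket path has to be supplied by the caller because it is not stated in this post.

```python
import base64
import hashlib
import os
import socket
import ssl

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(key: str) -> str:
    """Return the Sec-WebSocket-Accept value a server should send for `key`."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_request(host: str, path: str, key: str) -> bytes:
    """Build a bare HTTP/1.1 WebSocket upgrade request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def probe(host: str, port: int, path: str, timeout: float = 5.0) -> str:
    """Send one upgrade request over TLS and return the response status line."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    ctx = ssl.create_default_context()
    # Offer only HTTP/1.1 via ALPN so the proxy does not negotiate HTTP/2.
    ctx.set_alpn_protocols(["http/1.1"])
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_upgrade_request(host, path, key))
            head = tls.recv(4096).decode("latin-1")
    return head.split("\r\n", 1)[0]
```

Calling `probe("<host>", 443, "<websocket-path>")` and then the same with port 8085 should show whether the proxy answers with `101 Switching Protocols` on one entrypoint and something else on the other.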
2. Environment Architecture & Stack Details
Orchestrator: Docker Swarm (Global mode deployment for Traefik on managers)
Ingress/Proxy Engine: Traefik v3.7.0
Application Server: FleetDM v4.84.2 (using Redis 7.2.4 state store)
Host Interface Configuration: Traefik ports are published with mode: host to bypass the Swarm ingress routing mesh where possible.
Network Drivers: Overlay network (traefik-net) configured as external/attachable.
Share your full Traefik static and dynamic config.
I didn't attach the dynamic file config for my Fleet service, but here are the compose files for Traefik and Fleet.
Fleet Service YML:
```
version: "3.8"
networks:
  fleet-net:
    driver: overlay
    attachable: true
  traefik-net:
    external: true
    name: traefik-net
secrets:
  fleet-database-password:
    external: true
x-default-opts:
  &default-opts
  deploy:
    mode: replicated
    replicas: 1
    placement:
      max_replicas_per_node: 1
    restart_policy:
      condition: on-failure
      max_attempts: 3
    resources:
      limits:
        memory: 256M
      reservations:
        memory: 50M
services:
  redis:
    <<: *default-opts
    image: psbnexus:5002/redis:7.2.4-alpine3.19
    networks:
      - fleet-net
    deploy:
      resources:
        limits:
          memory: 100M
        reservations:
          memory: 50M
  fleet:
    <<: *default-opts
    image: fleetdm/fleet:v4.84.2
    command: sh -c "/usr/bin/fleet prepare db --no-prompt && /usr/bin/fleet serve"
    ports:
      - target: 9010
        published: 9010
        mode: host
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.docker.network=traefik-net"
        - "traefik.http.routers.fleet.rule=Host(`<host>`)"
        - "traefik.http.routers.fleet.entrypoints=websecure"
        - "traefik.http.routers.fleet.tls=true"
        - "traefik.http.routers.fleet.service=fleet-service"
        - "traefik.http.services.fleet-service.loadbalancer.server.port=9010"
    networks:
      - fleet-net
      - traefik-net
    secrets:
      - fleet-database-password
    environment:
      FLEET_MYSQL_ADDRESS: <>:3306
      FLEET_MYSQL_DATABASE: osquery
      FLEET_MYSQL_USERNAME: jarvis
      FLEET_MYSQL_PASSWORD_PATH: /run/secrets/fleet-database-password
      FLEET_REDIS_ADDRESS: redis:6379
      FLEET_SERVER_ADDRESS: '0.0.0.0:9010'
      FLEET_SERVER_TLS: 'false'
      FLEET_SERVER_PUBLIC_URL: 'https://<hostname>'
      FLEET_LIVE_QUERY_URL: 'http://<hostname>'
      FLEET_CORS_ALLOWED_ORIGINS: 'https://<hostname>,http://<hostname>'
      FLEET_LOGGING_JSON: "true"
      FLEET_ACTIVITY_ENABLE_AUDIT_LOG: "true"
      FLEET_FILESYSTEM_ENABLE_LOG_ROTATION: "true"
      FLEET_LOGGING_DEBUG: "true"
      FLEET_VULNERABILITIES_CURRENT_INSTANCE_CHECKS: "yes"
      FLEET_VULNERABILITIES_DISABLE_SCHEDULE: "true"
      FLEET_SESSION_DURATION: 15m
      FLEET_DISABLE_LOGIN_FORM_AUTOCOMPLETE: 1
    volumes:
      []
```
Traefik Service YML:
```
version: "3.8"
networks:
  default:
    name: traefik-net
    driver: overlay
    attachable: true
services:
  traefik:
    image: traefik:3.7.0
    container_name: traefik
    deploy:
      mode: global
      placement:
        constraints:
          - "node.role==manager"
      restart_policy:
        condition: on-failure
        max_attempts: 3
      labels:
        - "traefik.enable=true"
        - "traefik.http.middlewares.sslheader.headers.customrequestheaders.X-Forwarded-Proto=https"
        - "traefik.http.middlewares.sslheader.headers.isdevelopment=false"
        - "traefik.http.routers.traefik.middlewares=sslheader"
        - "traefik.http.routers.traefik.rule=Host(`<host>`)"
        - "traefik.http.routers.traefik.entrypoints=websecure"
        - "traefik.http.routers.traefik.tls=true"
        - "traefik.http.routers.traefik.service=api@internal"
        - "traefik.http.services.traefik.loadbalancer.server.port=8080"
    command:
      - "--global.checknewversion=false"
      - "--global.sendanonymoususage=false"
      - "--accesslog=true"
      - "--log.level=DEBUG"
      - "--log.format=json"
      - "--api=true"
      - "--api.dashboard=true"
      - "--api.debug=false"
      - "--api.insecure=true"
      - "--serverstransport.insecureskipverify=true"
      - "--providers.http=false"
      - "--providers.docker=false"
      - "--providers.file.filename=/etc/traefik/dynamic.yml"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.metrics.address=:8082"
      - "--entrypoints.fleetport.address=:8085"
      - "--metrics.prometheus.entryPoint=metrics"
      - "--metrics.prometheus=true"
      - "--metrics.prometheus.addEntryPointsLabels=true"
      - "--metrics.prometheus.addServicesLabels=true"
      - "--metrics.prometheus.manualRouting=true"
      - "--entrypoints.websecure.http2.maxconcurrentstreams=0"
      - "--entrypoints.websecure.transport.respondingtimeouts.readtimeout=3600s"
      - "--entrypoints.websecure.transport.respondingtimeouts.writetimeout=3600s"
      - "--entrypoints.websecure.transport.respondingtimeouts.idletimeout=3600s"
      - "--entrypoints.fleetport.transport.respondingtimeouts.readtimeout=3600s"
      - "--entrypoints.fleetport.transport.respondingtimeouts.writetimeout=3600s"
      - "--entrypoints.fleetport.transport.respondingtimeouts.idletimeout=3600s"
      - "--serverstransport.forwardingtimeouts.dialtimeout=30s"
      - "--serverstransport.forwardingtimeouts.responseheadertimeout=0s"
      - "--serverstransport.forwardingtimeouts.idleconntimeout=90s"
      - "--providers.swarm=true"
      - "--providers.swarm.endpoint=unix:///var/run/docker.sock"
      - "--providers.swarm.network=traefik-net"
      - "--providers.swarm.exposedbydefault=false"
    ports:
      - target: 80
        published: 80
        protocol: tcp
        mode: host
      - target: 443
        published: 443
        protocol: tcp
        mode: host
      - target: 8085
        published: 8085
        protocol: tcp
        mode: host
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "/data/certs:/etc/certs:ro"
      - "/data/traefik/traefik.yml:/etc/traefik/dynamic.yml"
```
Use 3 backticks before and after code/config in posts to make it more readable and preserve spacing, which is important in yaml.
Thanks. Do you have any idea about this issue? I have already revised the formatting.
Maybe the mistake is here:
FLEET_SERVER_PUBLIC_URL: 'https://fleet-staging.psbank.com.ph'
FLEET_LIVE_QUERY_URL: 'http://fleet-stagingpsbank.com.ph'
I've already tried removing that too; it's still not working.
Did you see there is a dot missing in the second domain name?
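A mismatch like this is easy to miss by eye, so it can help to compare the two values programmatically. Here is a minimal sketch in Python; the helper name `compare_fleet_urls` is made up for illustration, and the two URLs are simply the values quoted above.

```python
from urllib.parse import urlsplit


def compare_fleet_urls(public_url: str, live_query_url: str) -> list:
    """Return human-readable differences in scheme and hostname."""
    a, b = urlsplit(public_url), urlsplit(live_query_url)
    problems = []
    if a.scheme != b.scheme:
        problems.append(f"scheme differs: {a.scheme!r} vs {b.scheme!r}")
    if a.hostname != b.hostname:
        problems.append(f"hostname differs: {a.hostname!r} vs {b.hostname!r}")
    return problems


if __name__ == "__main__":
    for problem in compare_fleet_urls(
        "https://fleet-staging.psbank.com.ph",
        "http://fleet-stagingpsbank.com.ph",
    ):
        print(problem)
```

For the two values quoted above it reports both a scheme difference (https vs http) and a hostname difference.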