About one in three deployments from my GitHub Actions pipeline fails, and my server then responds with
503 Service Unavailable.
When I enabled debug logging I saw the following warning:
2025-04-15T17:09:44Z WRN github.com/traefik/traefik/v3/pkg/healthcheck/healthcheck.go:125 > Health check failed. error="HTTP request failed: Get \"http://10.0.0.14:80/\": context deadline exceeded" serviceName=frontend@docker targetURL=http://10.0.0.14:80
My application stack is configured to use Docker Swarm and I deploy with the following command:
$ docker stack deploy --with-registry-auth -c /opt/clever-cash/compose.base.yml -c /opt/clever-cash/stack.test.yml my-app
What can I do to make my deployments fully reliable?
Thank you for reading my post. I look forward to your insights.
Cheers,
Florestan
PS: I've attached the full debug log output at the end of this post.
compose.yml
services:
  frontend:
    container_name: frontend
    restart: on-failure
    ports:
      - 4200:80
    image: frontend
    build:
      context: ../../../
      cache_from:
        - frontend
      dockerfile: ./frontend/Dockerfile
    logging:
      driver: loki
      options:
        loki-url: /run/secrets/jwt_expire_in/loki-url
        loki-retries: 2
        loki-max-backoff: 800ms
        loki-timeout: 10s
        keep-file: 'true'
        mode: non-blocking
    healthcheck:
      test: ['CMD', 'curl', '-f', 'http://localhost:80/']
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 15s
stack.test.yml
services:
  traefik:
    image: traefik
    command:
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --certificatesresolvers.dnsresolver.acme.dnschallenge=true
      - --certificatesresolvers.dnsresolver.acme.dnschallenge.provider=cloudflare
      - --certificatesresolvers.dnsresolver.acme.email=admin@example.com
      - --certificatesresolvers.dnsresolver.acme.dnschallenge.delaybeforecheck=0
      - --certificatesresolvers.dnsresolver.acme.storage=/letsencrypt/acme.json
      - --entryPoints.websecure.address=:443
      - --entryPoints.web.address=:80
      - --entrypoints.web.http.redirections.entrypoint.to=websecure
      - --entrypoints.web.http.redirections.entrypoint.scheme=https
      - --serversTransport.forwardingTimeouts.dialTimeout=30s
      - --log.level=DEBUG
    ports:
      - 80:80
      - 443:443
    environment:
      - CF_DNS_API_TOKEN_FILE=/run/secrets/cloudflare_dns_api_token
    secrets:
      - cloudflare_dns_api_token
    volumes:
      - letsencrypt:/letsencrypt
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      replicas: 1
      update_config:
        parallelism: 1
        delay: 10s
        order: stop-first
        failure_action: rollback
      restart_policy:
        condition: any
  # FRONTEND
  frontend:
    image: frontend:latest
    labels:
      - traefik.enable=true
      - traefik.http.middlewares.frontend-retry.retry.attempts=5
      - traefik.http.middlewares.frontend-retry.retry.initialinterval=100ms
      # Match healthcheck timeout to container startup time
      - traefik.http.services.frontend.loadbalancer.healthcheck.path=/
      - traefik.http.services.frontend.loadbalancer.healthcheck.interval=10s
      - traefik.http.services.frontend.loadbalancer.healthcheck.timeout=8s
      # Link the middleware to the router (?)
      - traefik.http.routers.frontend-router.middlewares=frontend-retry@docker
      - traefik.http.routers.frontend-router.rule=Host(`example.com`)
      - traefik.http.routers.frontend-router.entrypoints=websecure
      - traefik.http.routers.frontend-router.tls.certresolver=dnsresolver
      - traefik.http.routers.frontend-router.service=frontend
    extra_hosts:
      - host.docker.internal:host-gateway
volumes:
  letsencrypt:
Full log output (log level DEBUG)
2025-04-15T17:08:56Z DBG github.com/traefik/traefik/v3/pkg/provider/acme/provider.go:984 > No ACME certificate generation required for domains ACME CA=https://acme-v02.api.letsencrypt.org/directory acmeCA=https://acme-v02.api.letsencrypt.org/directory domains=["example.com"] providerName=dnsresolver.acme routerName=frontend-router@docker rule=Host(`example.com`)
2025-04-15T17:09:07Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:213 > Service selected by WRR: http://10.0.0.14:80
2025-04-15T17:09:08Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:213 > Service selected by WRR: http://10.0.0.14:80
2025-04-15T17:09:08Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:213 > Service selected by WRR: http://10.0.0.14:80
2025-04-15T17:09:08Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:213 > Service selected by WRR: http://10.0.0.14:80
2025-04-15T17:09:13Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:213 > Service selected by WRR: http://10.0.0.14:80
2025-04-15T17:09:14Z WRN github.com/traefik/traefik/v3/pkg/healthcheck/healthcheck.go:125 > Health check failed. error="HTTP request failed: Get \"http://10.0.0.14:80/\": context deadline exceeded" serviceName=frontend@docker targetURL=http://10.0.0.14:80
2025-04-15T17:09:14Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:146 > Setting status of http://10.0.0.14:80 to DOWN serviceName=frontend@docker
2025-04-15T17:09:14Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:168 > Propagating new DOWN status serviceName=frontend@docker
2025-04-15T17:09:15Z DBG github.com/traefik/traefik/v3/pkg/proxy/httputil/proxy.go:121 > 499 Client Closed Request error="context canceled"
2025-04-15T17:09:15Z DBG github.com/traefik/traefik/v3/pkg/middlewares/retry/retry.go:177 > Final retry attempt failed error="context canceled" middlewareName=frontend-retry@docker middlewareType=Retry
2025-04-15T17:09:24Z WRN github.com/traefik/traefik/v3/pkg/healthcheck/healthcheck.go:125 > Health check failed. error="HTTP request failed: Get \"http://10.0.0.14:80/\": context deadline exceeded" serviceName=frontend@docker targetURL=http://10.0.0.14:80
2025-04-15T17:09:24Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:146 > Setting status of http://10.0.0.14:80 to DOWN serviceName=frontend@docker
2025-04-15T17:09:24Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:163 > Still DOWN, no need to propagate serviceName=frontend@docker
2025-04-15T17:09:34Z WRN github.com/traefik/traefik/v3/pkg/healthcheck/healthcheck.go:125 > Health check failed. error="HTTP request failed: Get \"http://10.0.0.14:80/\": context deadline exceeded" serviceName=frontend@docker targetURL=http://10.0.0.14:80
2025-04-15T17:09:34Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:146 > Setting status of http://10.0.0.14:80 to DOWN serviceName=frontend@docker
2025-04-15T17:09:34Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:163 > Still DOWN, no need to propagate serviceName=frontend@docker
2025-04-15T17:09:37Z DBG github.com/traefik/traefik/v3/pkg/proxy/httputil/proxy.go:121 > 504 Gateway Timeout error="dial tcp 10.0.0.14:80: i/o timeout"
2025-04-15T17:09:37Z DBG github.com/traefik/traefik/v3/pkg/middlewares/retry/retry.go:170 > New attempt 2 for request: /api/users/profile middlewareName=frontend-retry@docker middlewareType=Retry
2025-04-15T17:09:38Z DBG github.com/traefik/traefik/v3/pkg/proxy/httputil/proxy.go:121 > 504 Gateway Timeout error="dial tcp 10.0.0.14:80: i/o timeout"
2025-04-15T17:09:38Z DBG github.com/traefik/traefik/v3/pkg/middlewares/retry/retry.go:170 > New attempt 2 for request: /api/users/profile middlewareName=frontend-retry@docker middlewareType=Retry
2025-04-15T17:09:38Z DBG github.com/traefik/traefik/v3/pkg/proxy/httputil/proxy.go:121 > 504 Gateway Timeout error="dial tcp 10.0.0.14:80: i/o timeout"
2025-04-15T17:09:38Z DBG github.com/traefik/traefik/v3/pkg/middlewares/retry/retry.go:170 > New attempt 2 for request: /ngsw-worker.js middlewareName=frontend-retry@docker middlewareType=Retry
2025-04-15T17:09:43Z DBG github.com/traefik/traefik/v3/pkg/proxy/httputil/proxy.go:121 > 504 Gateway Timeout error="dial tcp 10.0.0.14:80: i/o timeout"
2025-04-15T17:09:43Z DBG github.com/traefik/traefik/v3/pkg/middlewares/retry/retry.go:170 > New attempt 2 for request: /ngsw.json?ngsw-cache-bust=0.8821232093868159 middlewareName=frontend-retry@docker middlewareType=Retry
2025-04-15T17:09:44Z WRN github.com/traefik/traefik/v3/pkg/healthcheck/healthcheck.go:125 > Health check failed. error="HTTP request failed: Get \"http://10.0.0.14:80/\": context deadline exceeded" serviceName=frontend@docker targetURL=http://10.0.0.14:80
2025-04-15T17:09:44Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:146 > Setting status of http://10.0.0.14:80 to DOWN serviceName=frontend@docker
2025-04-15T17:09:44Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:163 > Still DOWN, no need to propagate serviceName=frontend@docker
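
In case it helps anyone reproduce the analysis, here is a minimal shell sketch for counting the failed health checks per target in output like the above. The function name summarise_health_failures is my own invention, and it assumes the log line format shown in this post (a "Health check failed" message that carries a targetURL= field).

```shell
# Count Traefik health-check failures per target URL.
# Reads log text on stdin; assumes the line format shown in this post.
summarise_health_failures() {
  # Keep only the failed health-check lines ...
  grep 'Health check failed' \
    | grep -o 'targetURL=[^ ]*' \
    | sort \
    | uniq -c   # ... and print "<count> targetURL=<url>" per distinct target
}
```

It could be fed from something like `docker service logs my-app_traefik 2>&1 | summarise_health_failures`, assuming the Traefik service carries the usual <stack>_<service> name. In the excerpt above every failure points at the single target http://10.0.0.14:80, so it may be worth checking whether that address still belongs to a running frontend task after the deploy.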