There are many discussions about needing to restart Traefik whenever another container is reconfigured (for instance via labels in its compose.yml file).
However, this also happens to me whenever I simply reboot/restart a container:
podman restart uptime-kuma
If I visit the corresponding URL, I get a 502 Bad Gateway.
Logs show
time="2024-01-22T11:01:34Z" level=debug msg="'502 Bad Gateway' caused by: dial tcp 10.89.0.105:3001: connect: no route to host"
(the other URLs for the non-affected containers work without any issue)
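One quick way to narrow this down is to probe the backend address from the error log directly, to see whether the container really is unreachable or whether only Traefik's view of it is stale. A sketch (the helper name is mine; the IP and port come from the log line above):

```shell
# Hypothetical helper: probe the backend address Traefik failed to dial.
# Prints the HTTP status code, or "unreachable" if the dial fails/times out.
probe_backend() {
  curl -m 3 -s -o /dev/null -w '%{http_code}\n' "$1" || echo "unreachable"
}
# Usage (address taken from the log line above):
#   probe_backend http://10.89.0.105:3001
```

If this prints a status code while Traefik still reports "no route to host", Traefik is dialing an outdated IP.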
If I do this instead then it works correctly:
podman restart uptime-kuma
systemctl --user restart container-traefik
I use the "direct" podman command when no systemd service is installed yet. Once a systemd service is installed, however, running the "direct" podman command causes issues.
Is this intended/expected behaviour?
I would expect the containers to simply reconnect. Or is it an issue of the IP changing, or of DNS name resolution not working correctly, after "just" the container (not Traefik) is restarted?
This is for uptime-kuma, but I recall it happening with pretty much every container.
compose.yml
version: '3'
services:
  uptime-kuma:
    container_name: uptime-kuma
    image: louislam/uptime-kuma:latest
    volumes:
      - ~/data/uptime-kuma:/app/data
    ports:
      - 3001:3001
    restart: unless-stopped
    networks:
      - podman
      - traefik
    cap_add:
      - NET_RAW
      - NET_BIND_SERVICE
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.uptime-kuma.rule=Host(`uptime-kuma.MYDOMAIN.TLD`)"
      - "traefik.http.services.uptime-kuma.loadbalancer.server.port=3001"
      - "traefik.docker.network=traefik"
networks:
  podman:
    external: true
  traefik:
    external: true
EDIT: it seems indeed that the IP address of uptime-kuma changes every time (both on the main podman network and on the traefik network).
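To confirm this, here is a small sketch (helper name is mine; container and network names assumed from this thread) that prints the IP podman reports for a container on the traefik network. Run it before and after a restart and compare:

```shell
# Hypothetical helper: print a container's IP on the "traefik" network.
ip_on_traefik_net() {
  podman inspect "$1" \
    --format '{{ (index .NetworkSettings.Networks "traefik").IPAddress }}'
}
# Usage:
#   ip_on_traefik_net uptime-kuma   # note the IP
#   podman restart uptime-kuma
#   ip_on_traefik_net uptime-kuma   # a different IP means Traefik's target is stale
```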
podman is not Docker. Traefik has Configuration Discovery for Docker (doc); I'm not sure how compatible podman is with the Docker API.
The error indicates the IP is not correctly updated in Traefik after the target service restarts. Is this just for a few seconds or permanent? Enable Traefik debug log (doc).
Share your full Traefik static and dynamic config, and docker-compose.yml if used.
I don't think I got a notification of your reply. Sorry for the delay.
Podman should be fairly compatible with Docker, as far as I can understand.
The issue is permanent. Traefik must ALWAYS be the LAST container that is restarted, otherwise it won't work. Sometimes I get a 404 not found error message when that happens.
I cannot upload the log here, which is kinda annoying. Do we really have to post it on an external pastebin site?
version: '3.9'

networks:
  traefik:
    external: true

services:
  traefik:
    image: traefik:latest
    hostname: traefik
    restart: always
    container_name: traefik
    ports:
      - "80:80"
      - "443:443"
      - "465:465"
    networks:
      - traefik
    volumes:
      - /run/user/1001/podman/podman.sock:/var/run/docker.sock:ro
      - ~/config/traefik/dynamic:/etc/traefik/dynamic
      - ~/certificates/letsencrypt:/certificates
      - ~/log/traefik:/log
    command:
      ## Logging
      # Server Log
      - "--log.level=DEBUG"
      # - "--log.level=INFO"
      - "--log.filePath=/log/server/traefik.log"
      # Access Log
      - "--accesslog=true"
      - "--accesslog.filePath=/log/access/access.log"
      ## Dashboard & API
      - "--api"
      - "--api.insecure=false" # production = false , development = true
      - "--api.dashboard=true"
      ## EntryPoints
      # HTTP - Unsecure Connection - Redirect to Secure
      - "--entrypoints.web.address=:80"
      - "--entrypoints.web.http.redirections.entrypoint.to=websecure"
      - "--entryPoints.web.http.redirections.entrypoint.scheme=https"
      - "--entrypoints.web.http.redirections.entrypoint.permanent=true"
      # HTTPS - Secure Connection
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.websecure.http.tls=true"
      # - "--entrypoints.websecure.http.tls.certresolver=letsencrypt"
      - "--entryPoints.websecure.transport.respondingTimeouts.readTimeout=420"
      - "--entryPoints.websecure.transport.respondingTimeouts.writeTimeout=420"
      - "--entryPoints.websecure.transport.respondingTimeouts.idleTimeout=420"
      # TCP - Email
      - "--entrypoints.mailsecure.address=:465"
      # TCP - Zabbix
      # - "--entryPoints.zabbix-tcp.address=:10051"
      # UDP - Zabbix
      # - "--entryPoints.zabbix-udp.address=:10051/udp"
      ## Docker / Podman Integration
      - "--providers.docker=true"
      - "--providers.docker.exposedByDefault=false"
      - "--providers.docker.watch=false"
      - "--providers.docker.swarmMode=false"
      - "--providers.docker.endpoint=unix:///var/run/docker.sock"
      # Use Dynamic Configuration
      - "--providers.file=true"
      - "--providers.file.directory=/etc/traefik/dynamic"
      # Crowdsec Integration
      # - "--providers.file=true"
      # - "--providers.file.filename=/config/config.yml"
      ## Other
      # ...
      - "--serversTransport.insecureSkipVerify=true"
      # No Telemetry
      - "--global.sendAnonymousUsage=false"
    labels:
      # Enable Traefik
      - "traefik.enable=true"
      # Dashboard
      # - "traefik.http.routers.dashboard.entrypoint=websecure" # !! If enabled, this line causes a 404 page not found for the dashboard !!
      - "traefik.http.routers.dashboard.rule=Host(`podmanserver16.MYDOMAIN.TLD`) && PathPrefix(`/api`, `/dashboard`)"
      - "traefik.http.routers.dashboard.service=api@internal"
~/config/traefik/traefik.yml is empty
~/config/traefik/dynamic/certificates.yml
tls:
  # stores:
  #   default:
  #     defaultCertificate:
  #       certFile: /certificates/MYDOMAIN.TLD/cert.pem
  #       keyFile: /certificates/MYDOMAIN.TLD/privkey.pem
  certificates:
    - certFile: /certificates/MYDOMAIN.TLD/cert.pem
      keyFile: /certificates/MYDOMAIN.TLD/privkey.pem
Crowdsec is disabled at the moment, but here you go:
~/config/traefik/dynamic/crowdsec-bouncer.yml
# http:
#   middlewares:
#     crowdsec-bouncer:
#       forwardAuth:
#         address: http://bouncer-traefik:8080/api/v1/forwardAuth
#         trustForwardHeader: true
And, in case it's not sufficiently clear: I must MANUALLY restart Traefik.
I played around a bit with Docker events, but it didn't really seem to work (it might actually create a loop, where the traefik container triggers a Docker/Podman event, which triggers another restart, and so on).
I don't remember on which server I tested that or what it did exactly, but that was the behavior I was observing.
Should it rather be this instead?
--providers.docker.watch=true
Sure. But I got the impression it’s only relevant for Docker Swarm (doc).
Is there a way you could test this with real Docker to see if it’s a Podman compatibility issue or an issue with Traefik?
I just tried, but watch by itself doesn't do anything. The doc section doesn't say anything about Docker Swarm though.
I cannot really move everything from Podman to Docker and back ...
Not really a solution, but rather a (maybe incomplete) workaround: basically, always ensure that traefik is the LAST container to be (re)started.
Note: my containers are named $container and the associated systemd service is named container-$container.service. For example, container-dashy.service is the systemd service for the dashy container.
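For context, this naming convention matches what `podman generate systemd` produces by default (its default unit-name prefix is "container-"). A sketch, assuming the units here were created that way (the helper name is mine):

```shell
# Hypothetical helper: generate a user systemd unit for an existing container,
# using podman's default "container-" prefix for the unit name.
generate_unit() {
  podman generate systemd --new --name "$1" \
    > ~/.config/systemd/user/"container-$1.service"
}
# Usage:
#   generate_unit dashy
#   systemctl --user daemon-reload
#   systemctl --user enable --now container-dashy.service
```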
- Create /home/podman/bin/monitor-traefik.sh:
#!/bin/bash
while true
do
    # List containers
    mapfile -t list < <( podman ps --all --format="{{.Names}}" )
    # Get current epoch time
    now=$(date +%s)
    # Get the epoch time at which traefik was started (constant value)
    traefik_startedat=$(podman ps --all --format="{{.StartedAt}}" --filter name=traefik)
    # Get traefik running duration
    traefik_duration_s=$((now-traefik_startedat))
    # Define if traefik must be restarted
    traefik_restart=0
    for container in "${list[@]}"
    do
        # Get the epoch time at which the container was started (constant value)
        container_startedat=$(podman ps --all --format="{{.StartedAt}}" --filter name="$container")
        # Get container running duration
        container_duration_s=$((now-container_startedat))
        # Compare against traefik start time
        if [[ ${traefik_startedat} -lt ${container_startedat} ]]
        then
            echo "Container $container was started AFTER the traefik proxy server. Restarting traefik is necessary"
            traefik_restart=1
        fi
    done
    if [[ ${traefik_restart} -gt 0 ]]
    then
        # Restart the traefik container
        echo "Restarting traefik container"
        systemctl --user restart container-traefik
    fi
    # Wait a bit
    sleep 15
done
- Give execution permissions: chmod +x /home/podman/bin/monitor-traefik.sh
- Create a very basic systemd service if you are using systemd: nano ~/.config/systemd/user/monitor-traefik.service
- Enter the following:

[Unit]
Description=Traefik Monitoring and Restarting Tool

[Service]
ExecStart=/bin/bash -c '~/bin/monitor-traefik.sh'
#ExecStop=/bin/bash -c '~/bin/monitor-traefik-stop.sh'

[Install]
WantedBy=default.target

- Reload systemd service files: systemctl --user daemon-reload
- Enable the service to start automatically at each boot: systemctl --user enable monitor-traefik.service
- Start the service: systemctl --user restart monitor-traefik.service
- Verify the status is OK: systemctl --user status monitor-traefik.service
- Check the logs from time to time and in case of issues: journalctl --user -xeu monitor-traefik.service
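The core comparison in the monitor script can be factored into a pure-bash function (a sketch; the function name is mine, and no podman is needed), which makes the decision logic easy to test in isolation:

```shell
# Returns 0 (restart needed) when any of the given container start times
# (epoch seconds) is later than traefik's start time.
needs_traefik_restart() {
  local traefik_started=$1; shift
  local started
  for started in "$@"; do
    if [[ $traefik_started -lt $started ]]; then
      return 0
    fi
  done
  return 1
}
# Example: traefik started at epoch 1000, containers at 900 and 1100;
# the container started at 1100 is newer, so a restart is needed:
#   needs_traefik_restart 1000 900 1100   # exit status 0
```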
Updated /home/podman/bin/monitor-traefik.sh, now able to handle container names that would otherwise be flagged as "duplicates", resulting in a script error (e.g. I had peppermint and peppermint_postgres, which would produce an error unless I used the anchored regex ^$container$ in podman ps --all --format="{{.StartedAt}}" --filter name=^$container\$).
#!/bin/bash
while true
do
    # List containers
    mapfile -t list < <( podman ps --all --format="{{.Names}}" )
    # Get current epoch time
    now=$(date +%s)
    # Get the epoch time at which traefik was started (constant value)
    traefik_startedat=$(podman ps --all --format="{{.StartedAt}}" --filter name=traefik)
    # Get traefik running duration
    traefik_duration_s=$((now-traefik_startedat))
    # Define if traefik must be restarted
    traefik_restart=0
    for container in "${list[@]}"
    do
        # Get the epoch time at which the container was started (anchored regex,
        # so e.g. "peppermint" does not also match "peppermint_postgres")
        container_startedat=$(podman ps --all --format="{{.StartedAt}}" --filter name="^${container}\$")
        # Get container running duration
        container_duration_s=$((now-container_startedat))
        # Compare against traefik start time
        if [[ ${traefik_startedat} -lt ${container_startedat} ]]
        then
            echo "Container $container was started AFTER the traefik proxy server. Restarting traefik is necessary"
            traefik_restart=1
        fi
    done
    if [[ ${traefik_restart} -gt 0 ]]
    then
        # Restart the traefik container
        echo "Restarting traefik container"
        systemctl --user restart container-traefik
    fi
    # Wait a bit
    sleep 15
done
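The duplicate issue is just substring matching: podman's --filter name= treats the value as a pattern matched anywhere in the name, so anchoring it with ^ and $ restricts it to exact names. A quick illustration with grep (no podman needed):

```shell
# Two container names where one is a prefix of the other:
names=$'peppermint\npeppermint_postgres'
# Unanchored pattern matches both lines (-> two StartedAt values, script error):
printf '%s\n' "$names" | grep -c 'peppermint'     # prints 2
# Anchored pattern matches exactly one line:
printf '%s\n' "$names" | grep -c '^peppermint$'   # prints 1
```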
Actually, I think I just discovered that if I add
- "--providers.docker.allowEmptyServices=true"
in the traefik compose.yml file under the Docker provider options, it seems to work. It's picking up containers started even AFTER the last traefik restart...
@bluepuma77: is this a BUG or a "feature"? I'm pretty sure that most of my containers do NOT have healthcheck monitoring enabled...
My understanding is that Traefik providers.docker will first list all containers via the Docker API, then continuously listen for Docker events to register and de-register targets.
If new services/containers are not picked up after a Traefik restart, then it's probably a compatibility issue between Docker events and podman.
If Traefik providers.docker doesn't work with podman, then you are probably requesting a feature. In Traefik v3, there is even a new, separate providers.swarm for Docker Swarm.
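One way to check whether that event stream works under podman at all is to watch it directly. A sketch (the helper name is mine; with real podman this streams until interrupted):

```shell
# Hypothetical helper: stream container start/stop events, the same kind of
# events Traefik's docker provider listens for.
watch_events() {
  podman events --filter type=container \
    --filter event=start --filter event=stop
}
# Usage: run watch_events in one terminal, then `podman restart uptime-kuma`
# in another; if nothing is printed, Traefik cannot see the restart either.
```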
No, you don't understand.
I am saying that WITH:
- "--providers.docker.allowEmptyServices=true"
Then it DOES WORK with Podman.
It might be something that is not required with Docker; I agree with you on that.
But it seems a bit easier than my "hack" script that restarts traefik automatically each time something triggers a restart of any other container.
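For completeness, a sketch of where the flag goes, alongside the other docker-provider options in the traefik compose.yml posted earlier (the inline comment is my reading of the option, not official wording):

```yaml
command:
  # ...
  ## Docker / Podman Integration
  - "--providers.docker=true"
  - "--providers.docker.exposedByDefault=false"
  - "--providers.docker.allowEmptyServices=true"  # keep routers for services with no (healthy) server yet
  - "--providers.docker.endpoint=unix:///var/run/docker.sock"
```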