Need to restart traefik whenever a container is restarted (even if compose.yml configuration does NOT change)

There are lots of discussions about needing to restart traefik whenever another container is reconfigured (with labels in compose.yml file for instance).

However, this also happens to me whenever I simply decide to reboot/restart a container.

podman restart uptime-kuma

If I visit the required URL, I get Bad Gateway.
Logs show

time="2024-01-22T11:01:34Z" level=debug msg="'502 Bad Gateway' caused by: dial tcp 10.89.0.105:3001: connect: no route to host"

(the other URLs for the non-affected containers work without any issue)

If I do this instead then it works correctly:

podman restart uptime-kuma
systemctl --user restart container-traefik

I use the "direct" podman command if there is not yet a Systemd Service installed.
In case a Systemd Service is installed however, running the "direct" podman command causes issues.
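
The choice between the two restart paths could be wrapped in a small helper. This is only a sketch: restart_container is a hypothetical name, and it assumes the container-$name.service naming convention used elsewhere in this thread. It uses "systemctl --user cat" purely as an existence check for the unit.

```shell
# Restart a container through its systemd user unit when one exists,
# otherwise fall back to the "direct" podman command.
# Assumption: units are named container-<name>.service.
restart_container() {
    name="$1"
    unit="container-${name}.service"
    if systemctl --user cat "$unit" >/dev/null 2>&1; then
        # A unit is installed: let systemd do the restart.
        systemctl --user restart "$unit"
    else
        # No unit yet: restart the container directly.
        podman restart "$name"
    fi
}
```

Usage would be e.g. `restart_container uptime-kuma`.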

Is this intended/expected behaviour?
I would expect traefik to simply reconnect to the container. Or is it an issue of the IP changing, or of DNS name resolution not working correctly after restarting "just" the container (not traefik)?

This is for uptime-kuma but I recall it occurred pretty much with every container

compose.yml

version: '3'
services:
  uptime-kuma:
    container_name: uptime-kuma
    image: louislam/uptime-kuma:latest
    volumes:
      - ~/data/uptime-kuma:/app/data
    ports:
      - 3001:3001
    restart: unless-stopped
    networks:
      - podman
      - traefik
    capabilities: {CAP_NET_RAW,CAP_NET_BIND_SERVICE}
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.uptime-kuma.rule=Host(`uptime-kuma.MYDOMAIN.TLD`)"
      - "traefik.http.services.uptime-kuma.loadbalancer.server.port=3001"
      - "traefik.docker.network=traefik"

networks:
  podman:
    external: true
  traefik:
    external: true

EDIT: it seems indeed that the IP address of uptime-kuma is changing every time (both on the main podman network as well as on the traefik network).
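
One way to confirm that traefik is still dialing the old address is to compare the IP in the 502 log line with the container's current IPs. The helper names below (upstream_from_log, current_ips, is_stale) are made up for this sketch, and it assumes podman inspect exposes the addresses under .NetworkSettings.Networks, as the Docker-compatible inspect output does.

```shell
# Pull the upstream "IP:port" out of traefik 502 debug lines read from stdin.
upstream_from_log() {
    sed -n 's/.*dial tcp \([0-9.]*:[0-9]*\):.*/\1/p'
}

# Print the container's current IP on every attached network, one per line.
current_ips() {
    podman inspect --format '{{range .NetworkSettings.Networks}}{{println .IPAddress}}{{end}}' "$1"
}

# Succeed (exit 0) if the address traefik dialed is no longer one of the
# container's current IPs. $1 = "IP:port" from the log, $2 = container name.
is_stale() {
    upstream_ip="${1%%:*}"
    ! current_ips "$2" | grep -qxF "$upstream_ip"
}
```

For example, `is_stale "$(tail -n 50 ~/log/traefik/server/traefik.log | upstream_from_log | tail -n 1)" uptime-kuma && echo "traefik is using a stale IP"`.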

podman is not Docker. Traefik has Configuration Discovery for Docker (doc), but I am not sure how compatible podman is with the Docker API.

The error indicates the IP is not correctly updated in Traefik after the target service restart. Is this just for a few seconds or permanently? Enable Traefik debug log (doc).

Share your full Traefik static and dynamic config, and docker-compose.yml if used.

I don't think I got a notification of your reply. Sorry for the delay.

Podman should be fairly compatible with Docker, as far as I can understand.

The issue is permanent. Traefik must ALWAYS be the LAST container restarted, otherwise it won't work. Sometimes I get a 404 not found error message when that happens.
Cannot upload the log here, that's kinda annoying. Do we really have to upload a pastebin to an external site for it ?

version: '3.9'

networks:
  traefik:
    external: true

services:
  traefik:
    image: traefik:latest
    hostname: traefik
    restart: always
    container_name: traefik
    ports:
      - "80:80"
      - "443:443"
      - "465:465"
    networks:
      - traefik
    volumes:
      - /run/user/1001/podman/podman.sock:/var/run/docker.sock:ro
      - ~/config/traefik/dynamic:/etc/traefik/dynamic
      - ~/certificates/letsencrypt:/certificates
      - ~/log/traefik:/log
    command:
      ## Logging
      # Server Log
      - "--log.level=DEBUG"
#      - "--log.level=INFO"
      - "--log.filePath=/log/server/traefik.log"

      # Access Log
      - "--accesslog=true"
      - "--accesslog.filePath=/log/access/access.log"

      ## Dashboard & API
      - "--api"
      - "--api.insecure=false" # production = false , development = true
      - "--api.dashboard=true"

      ## EntryPoints
      # HTTP - Unsecure Connection - Redirect to Secure
      - "--entrypoints.web.address=:80"
      - "--entrypoints.web.http.redirections.entrypoint.to=websecure"
      - "--entryPoints.web.http.redirections.entrypoint.scheme=https"
      - "--entrypoints.web.http.redirections.entrypoint.permanent=true"

      # HTTPs - Secure Connection
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.websecure.http.tls=true"
#      - "--entrypoints.websecure.http.tls.certresolver=letsencrypt"

      - "--entryPoints.websecure.transport.respondingTimeouts.readTimeout=420"
      - "--entryPoints.websecure.transport.respondingTimeouts.writeTimeout=420"
      - "--entryPoints.websecure.transport.respondingTimeouts.idleTimeout=420"

      # TCP - Email
      - "--entrypoints.mailsecure.address=:465"

      # TCP - Zabbix
#      - "--entryPoints.zabbix-tcp.address=:10051"

      # UDP - Zabbix
#      - "--entryPoints.zabbix-udp.address=:10051/udp"

      ## Docker / Podman Integration
      - "--providers.docker=true"
      - "--providers.docker.exposedByDefault=false"
      - "--providers.docker.watch=false"
      - "--providers.docker.swarmMode=false"
      - "--providers.docker.endpoint=unix:///var/run/docker.sock"

      # Use Dynamic Configuration
      - "--providers.file=true"
      - "--providers.file.directory=/etc/traefik/dynamic"


      # Crowdsec Integration
      #- "--providers.file=true"
      #- "--providers.file.filename=/config/config.yml"

      ## Other
      # ...
      - "--serversTransport.insecureSkipVerify=true"

      # No Telemetry
      - "--global.sendAnonymousUsage=false"

    labels:
      # Enable Traefik
      - "traefik.enable=true"

      # Dashboard
#      - "traefik.http.routers.dashboard.entrypoint=websecure" # !! If enabled, this line causes a 404 page not found for the dashboard !!
      - "traefik.http.routers.dashboard.rule=Host(`podmanserver16.MYDOMAIN.TLD`) && PathPrefix(`/api` , `/dashboard`)"
      - "traefik.http.routers.dashboard.service=api@internal"

~/config/traefik/traefik.yml is empty


~/config/traefik/dynamic/certificates.yml

tls:
#  stores:
#    default:
#      defaultCertificate:
#        certFile: /certificates/MYDOMAIN.TLD/cert.pem
#        keyFile: /certificates/MYDOMAIN.TLD/privkey.pem
  certificates:
    - certFile: /certificates/MYDOMAIN.TLD/cert.pem
      keyFile: /certificates/MYDOMAIN.TLD/privkey.pem

Crowdsec is disabled at the moment but here you go
~/config/traefik/dynamic/crowdsec-bouncer.yml

#http:
#  middlewares:
#    crowdsec-bouncer:
#      forwardauth:
#        address: http://bouncer-traefik:8080/api/v1/forwardAuth
#        trustForwardHeader: true

And, if it's not sufficiently clear, I must MANUALLY restart traefik.
I played around a bit with Docker events but it didn't really seem to work (actually it might create a loop, where the traefik container triggers a Docker/Podman event, which triggers another restart, etc.).

I don't remember on which Server I tested that and what it did exactly, but that was the behavior I was observing.

Should it rather be this instead ?

--providers.docker.watch=true

Sure. But I got the impression it’s only relevant for Docker Swarm (doc).

Is there a way you could test this with real Docker to see if it’s a Podman compatibility issue or an issue with Traefik?

I just tried but watch by itself doesn't do anything. The Doc Section doesn't say anything about Docker Swarm though.

I cannot really move everything from Podman to Docker and back ...

Not really a solution but rather a (maybe incomplete) workaround.

Basically always ensure that traefik is the LAST container to be (re)started.

Note: my containers are named $container and the associated Systemd service is named container-$container.service. For example container-dashy.service is the Systemd service for the dashy container.

  1. Create /home/podman/bin/monitor-traefik.sh:
#!/bin/bash

while true
do
    # List Containers
    mapfile -t list < <( podman ps --all --format="{{.Names}}" )

   # Get current epoch time
   now=$(date +%s)

   # Get past epoch Time in which traefik was started (constant value)
   traefik_startedat=$(podman ps --all --format="{{.StartedAt}}" --filter name=traefik)

   # Get traefik running duration
   traefik_duration_s=$((now-traefik_startedat))

   # Define if traefik must be restarted
   traefik_restart=0

   for container in "${list[@]}"
   do
       # Get past epoch Time in which the container was started (constant value)
       container_startedat=$(podman ps --all --format="{{.StartedAt}}" --filter name=$container)

       # Get container running duration
       container_duration_s=$((now-container_startedat))

       # Compare against traefik started time
       if [[ ${traefik_startedat} -lt ${container_startedat} ]]
       then
          echo "Container $container was started AFTER traefik Proxy Server. Restarting Traefik Necessary"
          traefik_restart=1
       fi
   done

   if [[ ${traefik_restart} -gt 0 ]]
   then
      # Restart traefik container
      echo "Restarting traefik container"
      systemctl --user restart container-traefik
   fi

   # Wait a bit
   sleep 15
done
  2. Give execution permissions: chmod +x /home/podman/bin/monitor-traefik.sh

  3. Create a very basic Systemd service if you are using Systemd
    nano ~/.config/systemd/user/monitor-traefik.service

  4. Enter the following

[Unit]
Description=Traefik Monitoring and Restarting Tool

[Service]
ExecStart=/bin/bash -c '~/bin/monitor-traefik.sh'
#ExecStop=/bin/bash -c '~/bin/monitor-traefik-stop.sh'

[Install]
WantedBy=default.target
  5. Reload Systemd Service Files: systemctl --user daemon-reload

  6. Enable the Service to start automatically at each boot: systemctl --user enable monitor-traefik.service

  7. Start the Service: systemctl --user restart monitor-traefik.service

  8. Verify the Status is OK: systemctl --user status monitor-traefik.service

  9. Check the logs from time to time and in case of issues: journalctl --user -xeu monitor-traefik.service
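
The monitoring script calls podman ps once per container. As a sketch of an alternative, the comparison could be done on the output of a single call. started_after_proxy is a hypothetical helper name; it relies, like the script, on {{.StartedAt}} printing epoch seconds, and it matches names exactly rather than through the name filter.

```shell
# Read "name epoch" lines on stdin and print every container that was
# started after the proxy container named in $1.
# Exits with status 1 if the proxy itself is not in the list.
started_after_proxy() {
    awk -v proxy="$1" '
        NF >= 2 { ts[$1] = $2 }          # remember each start time by exact name
        END {
            if (!(proxy in ts)) exit 1   # proxy not found: nothing to compare
            for (n in ts)
                if (ts[n] + 0 > ts[proxy] + 0) print n
        }'
}
```

Inside the loop this could be used as `podman ps --all --format '{{.Names}} {{.StartedAt}}' | started_after_proxy traefik`, restarting traefik whenever the output is non-empty.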

Updated /home/podman/bin/monitor-traefik.sh, which now handles container names that share a prefix. For example, I had peppermint and peppermint_postgres: the plain name filter matched both, which resulted in a script error. The lookup now anchors the name as a regex (^$container$): podman ps --all --format="{{.StartedAt}}" --filter name=^$container\$

#!/bin/bash

while true
do
    # List Containers
    mapfile -t list < <( podman ps --all --format="{{.Names}}" )

    #formatted=""

   # Get current epoch time
   now=$(date +%s)

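   # Get past epoch Time in which traefik was started (constant value)
   # Anchored like the per-container lookup below, so names that merely contain "traefik" do not match
   traefik_startedat=$(podman ps --all --format="{{.StartedAt}}" --filter "name=^traefik\$")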

   # Get traefik running duration
   traefik_duration_s=$((now-traefik_startedat))

   # Define if traefik must be restarted
   traefik_restart=0

   for container in "${list[@]}"
   do
       # Echo
       #echo "Processing container <$container>"

       # Get past epoch Time in which the container was started (constant value)
       # Quoted so the shell passes the anchored regex through unchanged
       container_startedat=$(podman ps --all --format="{{.StartedAt}}" --filter "name=^${container}\$")
       #started=${container_startedat}

       # Get container running duration
       container_duration_s=$((now-container_startedat))

       # Compare against traefik started time
       #echo "if [[ ${traefik_startedat} -lt ${container_startedat} ]]"
       if [[ ${traefik_startedat} -lt ${container_startedat} ]]
       then
          echo "Container $container was started AFTER traefik Proxy Server. Restarting Traefik Necessary"
          traefik_restart=1
       fi

       # Convert to hours
       #container_duration_h=$(echo "scale=0; ${container_duration_s}/3600" | bc)

       # Format it
       #formatted="${formatted}${container}|${container_duration_s} s|${container_duration_h} h \n"
   done

   #echo ${formatted} | column -t -s$'\t'
   #echo -e ${formatted} | column -t -s "|"


   if [[ ${traefik_restart} -gt 0 ]]
   then
      # Restart traefik container
      echo "Restarting traefik container"
      systemctl --user restart container-traefik
   fi

   # Wait a bit
   sleep 15
done
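
As a possible alternative to polling every 15 seconds, the restart could be driven by container start events. This is an untested sketch: react_to_starts and PROXY_NAME are hypothetical names, and the podman events filter and format fields shown in the comment should be checked against the installed podman version. Events from traefik itself are skipped, which is meant to avoid the restart loop mentioned earlier in the thread.

```shell
# Name of the proxy container; its own start events are ignored
# so that restarting it does not trigger yet another restart.
PROXY_NAME="traefik"

# Read container names (one per line) from stdin and run the given
# command once for every started container that is not the proxy.
react_to_starts() {
    while IFS= read -r name; do
        if [ -z "$name" ] || [ "$name" = "$PROXY_NAME" ]; then
            continue
        fi
        echo "Container $name started; restarting $PROXY_NAME" >&2
        # stdin is redirected so the command cannot consume the event stream
        "$@" </dev/null
    done
}

# Intended wiring (assumption, not executed here):
#   podman events --filter type=container --filter event=start --format '{{.Name}}' \
#       | react_to_starts systemctl --user restart container-traefik
```

Note that several containers starting at once would restart traefik several times in a row; the polling script does not have that drawback.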