Automatic certificate renewal stopped working after Traefik 2.11 to 3.1 update

Hello everyone,

We have been using Traefik successfully for a very long time. On both our staging and production environments we run several Docker containers plus a "main traefik container".
We have not changed anything about the acme setup in many years!

Recently we updated from Traefik 2.11 to 3.1, and since then the automatic renewal of the certificates seems to have stopped working.
I have now created a fresh acme.json file (chmod 600) and restarted the traefik container: no issues! The new certificate was retrieved without problems and is valid for the next 3 months. So there is no issue with our DNS or our setup in general.

But the "old acme.json" only held an outdated cert for one of our domains, and restarting traefik and the container in question did NOT get it renewed.
What is going on? I compared the 2 acme.json files and can't find any meaningful differences.
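
For anyone wanting to compare the two files more systematically, here is a minimal Python sketch. It assumes the usual acme.json layout (resolver name at the top level, then a "Certificates" list whose entries carry "domain.main" and "domain.sans"); that layout is an assumption and should be checked against your own file. The function names are made up for illustration.

```python
import json


def list_acme_domains(acme: dict) -> dict:
    """Map each resolver name to a sorted list of (main, sans) tuples.

    Assumes the layout {resolver: {"Certificates": [{"domain": {...}}]}}.
    """
    result = {}
    for resolver, data in acme.items():
        # "Certificates" may be missing or null in a freshly created file.
        certs = (data or {}).get("Certificates") or []
        entries = []
        for cert in certs:
            domain = cert.get("domain", {})
            sans = tuple(sorted(domain.get("sans") or []))
            entries.append((domain.get("main", ""), sans))
        result[resolver] = sorted(entries)
    return result


def load_acme_domains(path: str) -> dict:
    """Read an acme.json file from disk and summarise its domains."""
    with open(path, encoding="utf-8") as fh:
        return list_acme_domains(json.load(fh))


# Hypothetical usage, with file names chosen for illustration:
# print(load_acme_domains("acme.old.json"))
# print(load_acme_domains("acme.new.json"))
```

If the old file lists the domain under a different resolver name or with different SANs than the new one, that would at least be a concrete difference to look at.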

I am now afraid that we will run into the same issue on our prod env, which obviously must not happen!

Here is part of our config:

services:
  traefik:
    image: traefik:v3.1
    container_name: traefik
    command:
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --providers.docker=true
      - --providers.docker.watch=true
      - --providers.docker.exposedbydefault=false
      - --api=true
      - --api.dashboard=true
      - --certificatesresolvers.le.acme.email=foo@bar.com
      - --certificatesresolvers.le.acme.storage=/acme.json
      - --certificatesresolvers.le.acme.httpchallenge.entrypoint=web
      - --certificatesresolvers.le.acme.tlschallenge=true
      - --log=true
      - --log.filepath=/var/log/traefik.log
      - --log.level=WARN
    ports:
      - 80:80
      - 443:443
    labels:
      - traefik.enable=true
      - traefik.http.routers.api.entrypoints=web
      - traefik.http.routers.api.rule=Host(`stagingtraefik.orderlion.com`)
      - traefik.http.routers.api.service=api@internal
      - traefik.http.routers.api.middlewares=auth
      - traefik.http.middlewares.auth.basicauth.users=.....
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock:ro'
      - './acme.json:/acme.json'
      - '/var/log:/var/log'
    networks:
      - web
    restart: unless-stopped

  orderlion-staging:
    image: ${OLIMAGE}
    container_name: orderlion-staging
    environment:
      - ...
    labels:
      - traefik.enable=true
      - traefik.http.routers.orderlion-staging.rule=Host(`${HOST:-staging.orderlion.com}`) || Host(`stagingbak.orderlion.com`)
      - traefik.http.routers.orderlion-staging.tls=true
      - traefik.http.routers.orderlion-staging.tls.certresolver=le
      - traefik.http.routers.orderlion-staging.entrypoints=websecure
      - traefik.http.routers.orderlion-staging-http.entrypoints=web
      - traefik.http.routers.orderlion-staging-http.rule=Host(`${HOST:-staging.orderlion.com}`) || Host(`stagingbak.orderlion.com`)
      - traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https
      - traefik.http.middlewares.redirect-to-https.redirectscheme.permanent=true
      - traefik.http.routers.orderlion-staging-http.middlewares=redirect-to-https
      - traefik.http.services.orderlion-staging.loadbalancer.server.port=3000
      - traefik.http.services.orderlion-staging.loadbalancer.sticky=true
      - traefik.http.services.orderlion-staging.loadbalancer.sticky.cookie=true
      - traefik.http.services.orderlion-staging.loadbalancer.sticky.cookie.name=traefiksession
    volumes:
      - /files:/files
    depends_on:
      - traefik
    networks:
      - web
    restart: unless-stopped

Could it be an issue with the way we define the Host() of the container?

Can someone help? :) Is there any mistake, or a change I missed in my Traefik update?

Thanks! best,
Patrick

With Traefik v3, Host() only accepts a single domain and the RegEx changed. Could that be the reason?

I can confirm that we had to change the Host().

With Traefik v2.11 we had: Host(`foo.com`, `bar.com`)

With Traefik v3.1 we now have: Host(`foo.com`) || Host(`bar.com`)
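
If there are many rules to migrate, the rewrite above can be sketched as a small Python helper. This is only an illustration: the function name is hypothetical, it only handles backtick-quoted domains, and it wraps the result in parentheses so that precedence is preserved when the rule also contains `&&`.

```python
import re

# Matches Host(...) holding one or more backtick-quoted domains.
_HOST_CALL = re.compile(r"\bHost\(\s*(`[^`]*`(?:\s*,\s*`[^`]*`)*)\s*\)")
_DOMAIN = re.compile(r"`([^`]*)`")


def split_multi_host(rule: str) -> str:
    """Rewrite v2-style multi-domain Host() matchers into v3-style ORs."""

    def expand(match: re.Match) -> str:
        domains = _DOMAIN.findall(match.group(1))
        if len(domains) == 1:
            # Single-domain matchers are already valid in v3.
            return match.group(0)
        joined = " || ".join(f"Host(`{d}`)" for d in domains)
        return f"({joined})"

    return _HOST_CALL.sub(expand, rule)


print(split_multi_host("Host(`foo.com`, `bar.com`)"))
```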

But this works beautifully!
Could it be an issue with us using an env var inside the Host()?

labels:
  - traefik.http.routers.orderlion-staging.rule=Host(`${HOST:-staging.orderlion.com}`) || Host(`stagingbak.orderlion.com`)

We use this setup to be able to pass a custom host when calling docker compose up ..., while still having a default value.

But when running docker inspect, this gets resolved as it should! :)

"traefik.http.routers.orderlion-staging-http.rule": "Host(`staging.orderlion.com`) || Host(`stagingbak.orderlion.com`)"

Also: how can this really be the issue if creating a new acme.json and restarting both containers resulted in successful / valid certs? So in general everything works as it should, but the auto renewal seems to fail?!
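
To catch a silently failing renewal before it bites on prod, one option is to check how many days the served certificate has left. Below is a minimal sketch using only the Python standard library. The function names and the 30-day threshold are assumptions for illustration, not anything Traefik provides. Note that an already expired certificate makes the TLS handshake fail with ssl.SSLCertVerificationError instead of returning a value.

```python
import socket
import ssl
import time


def days_remaining(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a cert's notAfter string."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def served_cert_days_remaining(host: str, port: int = 443,
                               timeout: float = 5.0) -> float:
    """Connect to host:port and return the days left on the served cert.

    Raises ssl.SSLCertVerificationError if the cert is expired or invalid.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return days_remaining(cert["notAfter"], time.time())


# Hypothetical usage; the threshold of 30 days is an assumption:
# if served_cert_days_remaining("staging.orderlion.com") < 30:
#     print("certificate has not been renewed yet")
```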