Traefik loses certificates and requests old subdomains

My Traefik configuration has two problems.


First problem: I rented a new virtual server on 5.1.2024 and set up Traefik with this configuration (traefik.yml):

api:
  dashboard: true

certificatesResolvers:
  http:
    acme:
      email: "meine@email.de"           # E-Mail changed
      storage: "acme_letsencrypt.json"
      httpChallenge:
        entryPoint: http

entryPoints:
  http:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: "https"
          scheme: "https"
  https:
    address: ":443"

global:
  checknewversion: true
  sendanonymoususage: false

providers:
  docker:
    endpoint: "unix:///var/run/docker.sock"
    exposedByDefault: false
    network: "proxy"
  file:
    filename: "./dynamic_conf.yml"
    watch: true
  providersThrottleDuration: 10

This is the dynamic configuration (dynamic_conf.yml):

# TLS
# All certificate settings go here
# Combined with the settings under http.middlewares.default-security-headers, this should yield an A+ rating
tls:
  options:
    default:
      minVersion: VersionTLS12
      cipherSuites:
        - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
        - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
        - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
        - TLS_AES_128_GCM_SHA256
        - TLS_AES_256_GCM_SHA384
        - TLS_CHACHA20_POLY1305_SHA256
      curvePreferences:
        - CurveP521
        - CurveP384
      sniStrict: true

# Middlewares
# Optional processing applied to every request before it is forwarded to the container
http:
  middlewares:
    # A basic authentication middleware to protect the Traefik dashboard via htpasswd
    traefikAuth:
      basicAuth:
        users:
          - "benutzer:passworthash"   # Insert the password hash here

    # Recommended standard middleware for most services
    # Add it via "traefik.http.routers.definierteRoute.middlewares=default@file"
    default:
      chain:
        middlewares:
          - default-security-headers
          - gzip

    # Compatible with the old guide
    secHeaders:
      chain:
        middlewares:
          - default-security-headers
          - gzip

    # Standard Header
    default-security-headers:
      headers:
        browserXssFilter: true
        contentTypeNosniff: true
        forceSTSHeader: true
        frameDeny: true
#       Deprecated
#       sslRedirect: true
        #HSTS Configuration
        stsIncludeSubdomains: true
        stsPreload: true
        stsSeconds: 31536000
        customFrameOptionsValue: "SAMEORIGIN"
    # Gzip compression
    gzip:
      compress: {}

The Docker Compose file for Traefik:

version: '3.9'
services:
  traefik:
    container_name: traefik
    image: traefik:latest
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./data/traefik.yml:/traefik.yml:ro
      - ./data/acme_letsencrypt.json:/acme_letsencrypt.json
      - ./data/dynamic_conf.yml:/dynamic_conf.yml
    labels:
      - "com.centurylinklabs.watchtower.enable=true"
      - "traefik.enable=true"
      - "traefik.http.routers.traefik.entrypoints=https"
      - "traefik.http.routers.traefik.rule=Host(`traefik.meinedomain.de`)"  # Changed
      - "traefik.http.routers.traefik.middlewares=traefikAuth@file,default@file"
      - "traefik.http.routers.traefik.tls=true"
      - "traefik.http.routers.traefik.tls.certresolver=http"
      - "traefik.http.routers.traefik.service=api@internal"
      - "traefik.http.services.traefik.loadbalancer.sticky.cookie.httpOnly=true"
      - "traefik.http.services.traefik.loadbalancer.sticky.cookie.secure=true"
      - "traefik.docker.network=proxy"
    restart: unless-stopped
    security_opt:
      - no-new-privileges:true
    networks:
      proxy:
    hostname: traefik
    ports:
      - "80:80"
      - "443:443"

networks:
  proxy:
    name: proxy
    driver: bridge
    attachable: true

After installing Traefik, I created the Nextcloud instance with this configuration:

version: '3.3'
services:
  nextcloud-db:
    image: mariadb
    container_name: nextcloud-db
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: ${POSTGRES_PASSWD}
      MYSQL_USER: nextcloud
      MYSQL_DATABASE: nextcloud
      MYSQL_PASSWORD: ${POSTGRES_PASSWD}
    volumes:
      - mariadb-data:/var/lib/mysql
    expose:
      - 8080

  nextcloud-redis:
    image: redis:alpine
    container_name: nextcloud-redis
    hostname: nextcloud-redis
    networks:
        - default
    volumes:
      - redis-data:/data
    restart: unless-stopped
    command: redis-server --requirepass ${REDIS_PASSWD}

  nextcloud-app:
    image: nextcloud
    container_name: nextcloud-app
    restart: unless-stopped
    depends_on:
      - nextcloud-db
      - nextcloud-redis
    environment:
      TRUSTED_PROXIES: 172.18.0.2/16
      OVERWRITEPROTOCOL: https
      OVERWRITECLIURL: https://drive.${TLD}
      OVERWRITEHOST: drive.${TLD}
      REDIS_HOST: nextcloud-redis
      REDIS_HOST_PASSWORD: ${REDIS_PASSWD}
      MYSQL_HOST: nextcloud-db
      MYSQL_PASSWORD: ${POSTGRES_PASSWD}
      MYSQL_DATABASE: nextcloud
      MYSQL_USER: nextcloud
    volumes:
      - app:/var/www/html
      - data:/var/www/html/data
    labels:
      # Default labels
      - "traefik.enable=true"
      - "traefik.http.routers.nextcloud.entrypoints=https"
      - "traefik.http.routers.nextcloud.rule=Host(`drive.${TLD}`)"
      - "traefik.http.routers.nextcloud.tls=true"
      - "traefik.http.routers.nextcloud.tls.certresolver=http_resolver"
      - "traefik.http.routers.nextcloud.service=nextcloud"
      - "traefik.http.services.nextcloud.loadbalancer.server.port=80"
      - "traefik.docker.network=proxy"
      - "traefik.http.routers.nextcloud.middlewares=nextcloud-dav,default@file"
      - "traefik.http.middlewares.nextcloud-dav.replacepathregex.regex=^/.well-known/ca(l|rd)dav"
      - "traefik.http.middlewares.nextcloud-dav.replacepathregex.replacement=/remote.php/dav/"

    networks:
      - proxy
      - default

networks:
  proxy:
    external: true

volumes:
  mariadb-data:
    name: nextcloud-database
  data:
    name: nextcloud-data
  app:
    name: nextcloud-app
  redis-data:
    name: nextcloud-redis-data

It worked fine for three months, until the first certificate expired. On 4.4.2024, the certificate was renewed, and the new one was valid until 3.6.2024. But after a few hours, the new certificate was lost: my browser warned me with a "Did Not Connect: Potential Security Issue" message, and when I inspected the current certificate, the old one was displayed, which had been valid until 4.4.2024.

(Screenshot: old certificate)

This behavior repeats every time. When I recreate the container, Traefik gets the new certificate, valid until 3.6.2024, and after a couple of hours the new certificate is lost and the old one is served again. This happens ONLY with Nextcloud. Every other subdomain/stack/container works perfectly fine.
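One detail in the configuration above may be worth checking, since only Nextcloud is affected: the Traefik dashboard router uses certresolver=http, which matches the resolver defined under certificatesResolvers in traefik.yml, while the Nextcloud router references http_resolver, which is not defined there. This may simply be a difference introduced while anonymizing the post. A minimal sketch of the label, assuming the intent was to use the same resolver for both routers:

```yaml
# docker-compose.yml of the Nextcloud stack, excerpt
services:
  nextcloud-app:
    labels:
      # Assumption: the router should use the resolver named "http",
      # the only one defined under certificatesResolvers in traefik.yml
      - "traefik.http.routers.nextcloud.tls.certresolver=http"
```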


Second problem: When I look into the Traefik logs, there are only errors about the failed ACME challenge, like "time="2024-04-11T23:32:01+02:00" level=error msg="Cannot retrieve the ACME challenge for drive.TLD (token "w_UTKgFLS9KatyMNRaSd_tqgtfHNbgSodCVraGSkxsA")" providerName=acme". But these messages appear not only for the current subdomain. There are also errors for domains on the old server, which have not been in use for months! I don't know why Traefik requests subdomains that no longer exist.


I hope somebody can help me.

Make sure to use an absolute path, and that the file is stored on a bind mount or volume for persistence. I personally prefer a folder mount for Traefik, not individual files. Compare with the simple Traefik example.
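As a rough sketch of that advice applied to the configuration from the question: the storage path becomes absolute, and a whole folder is mounted instead of the single file. The directory names /letsencrypt and ./data/letsencrypt are assumptions chosen for illustration, not taken from the original setup.

```yaml
# traefik.yml (static configuration), excerpt
certificatesResolvers:
  http:
    acme:
      email: "meine@email.de"
      # Absolute path inside the container; /letsencrypt is an assumed directory name
      storage: "/letsencrypt/acme_letsencrypt.json"
      httpChallenge:
        entryPoint: http
---
# docker-compose.yml of the Traefik stack, excerpt
services:
  traefik:
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./data/traefik.yml:/traefik.yml:ro
      # Folder mount instead of a single-file mount, so the storage file
      # lives in a persistent directory (host path is an assumption)
      - ./data/letsencrypt:/letsencrypt
```

Traefik expects the storage file to have 600 permissions, so an existing file copied into the folder may need its mode adjusted.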

Old domain names can come from old files, either static or dynamic config, or via Docker configuration discovery from old containers. Are the files on the server or mounted from a shared folder?

But I didn't copy any files from the old server to the new one.

Update: If I ping the page continuously (I have a plugin that checks the state of the page), the error doesn't appear. But if the page isn't accessed for hours, the certificate gets lost.

Check the container (docker ps): is it running continuously? Is Watchtower restarting or re-creating it?
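The Compose file in the question uses traefik:latest and sets the Watchtower enable label to true, so Watchtower may re-create the container whenever a new image is published. To rule that out while debugging, something like the following could be tried. The tag v2.11 is only an assumed example version; use whichever release is actually running.

```yaml
# docker-compose.yml of the Traefik stack, excerpt
services:
  traefik:
    # Pinned tag instead of "latest"; v2.11 is an assumed example version
    image: traefik:v2.11
    labels:
      # Exclude this container from Watchtower updates while debugging
      - "com.centurylinklabs.watchtower.enable=false"
```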

Make sure to use an absolute path for the storage file.

I think I've found a solution. I stopped the Traefik server, deleted the "acme_letsencrypt.json" file, created a new one, and restarted Traefik. For two days now, there have been no issues.

Thanks for your help!
