Error with Traefik's DNS challenge while setting up Let's Encrypt

Hi Team,
I’m fairly new to Docker Swarm and Traefik, so I’d appreciate any guidance or tips as I work through this.

I’m facing an issue while deploying Traefik in a Docker Swarm environment and could really use your help. In my setup, I am trying to implement a Let's Encrypt certificate with the DNS challenge in Traefik.

Here’s the situation:

When deploying Traefik as a standalone Docker container using a GANDIV5_API_KEY token, everything works perfectly.

However, when I use the same token to deploy Traefik in Docker Swarm, the deployment fails with a 403 error stating that "the token doesn't have enough privileges."

What I’ve checked so far:

  • The token is valid and has the required permissions.
    
  • The runtime variables are correctly set in the Traefik container in both the standalone and Swarm setups.
    

Given this, I’m wondering if there are additional configurations, permissions, or Swarm-specific steps required to resolve this issue.

If anyone has encountered a similar issue or has insights on what might be causing this, your support and guidance would be greatly appreciated.

My Docker Swarm stack file:

version: '3.3'

services:
  traefik:
    image: 'traefik:v2.10' # Updated to the latest stable version
    command:
      - --log.level=INFO
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --providers.docker
      - --providers.docker.exposedbydefault=false
      - --providers.docker.swarmmode=true
      - --providers.docker.network=traefik-public
      - --api
      - --api.dashboard=true
      - --accesslog=true
      - --accesslog.filepath=/var/log/traefik/access.log
      - --log.level=DEBUG
      - --log.filepath=/var/log/traefik/traefik.log
      - --certificatesresolvers.certresolver.acme.email=admins@example.com
      - --certificatesresolvers.certresolver.acme.storage=/letsencrypt/acme.json
      - --certificatesresolvers.certresolver.acme.dnschallenge=true
      - --certificatesresolvers.certresolver.acme.dnschallenge.provider=gandiv5
      - --certificatesresolvers.certresolver.acme.dnschallenge.resolvers=217.70.185.65:53,8.8.8.8:53 # Added Gandi DNS servers
      - --certificatesresolvers.certresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
    environment:
      GANDIV5_API_KEY: ${GANDIV5_API_KEY}
    ports:
      - '80:80'
      - '443:443'
    networks:
      - traefik-public
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock:ro'
      - '/opt/swarm/acme/acme.json:/letsencrypt/acme.json'
    deploy:
      labels:
        - 'traefik.enable=true'
        - 'traefik.http.routers.traefik.rule=Host(`traefik.example.com`)'
        - 'traefik.http.routers.traefik.service=api@internal'
        - 'traefik.http.services.traefik.loadbalancer.server.port=8080'
        - 'traefik.http.routers.traefik.tls.certresolver=certresolver'
        - 'traefik.http.routers.traefik.entrypoints=websecure'
        - 'traefik.http.routers.traefik.middlewares=authtraefik'
        - 'traefik.http.middlewares.authtraefik.basicauth.users=$USERNAME:PASSWORD'
        - 'traefik.http.routers.http-catchall.rule=hostregexp(`{host:.+}`)'
        - 'traefik.http.routers.http-catchall.entrypoints=web'
        - 'traefik.http.routers.http-catchall.middlewares=redirect-to-https'
        - 'traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https'

networks:
  traefik-public:
    external: true
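
One Swarm-specific detail worth ruling out for the 403: docker stack deploy substitutes ${GANDIV5_API_KEY} from the shell environment of the machine running the deploy, and it does not load a .env file the way docker compose does. A different or empty value can therefore reach the service without any error at deploy time. A minimal sketch of an alternative, assuming the key is stored as a Swarm secret (the secret name gandiv5_api_key is hypothetical) and relying on the _FILE variant that the Traefik and lego documentation describe for provider variables:

```yaml
services:
  traefik:
    image: traefik:v3.1
    environment:
      # Assumption: the gandiv5 provider honours the _FILE suffix and reads
      # the key from this path instead of from GANDIV5_API_KEY.
      GANDIV5_API_KEY_FILE: /run/secrets/gandiv5_api_key
    secrets:
      - gandiv5_api_key

secrets:
  gandiv5_api_key:
    # Hypothetical secret name; it has to exist before the stack is deployed.
    external: true
```

The secret would be created once on a manager node, for example by piping the key into docker secret create gandiv5_api_key -, so the value no longer depends on the deploying shell.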

Is this an error that prevents Traefik from being deployed, or an error from Let's Encrypt?

Maybe share more of the debug log.

If you are starting fresh, why not use Traefik v3? Check the simple Traefik Swarm example.

Hi @bluepuma77
Thank you very much for your prompt response. This error occurred during the certificate generation phase. As you suggested, I tried with Traefik version 3, and it also failed. I didn't get any specific error code, but the certificates were not generated.

Inside the container I can see only the following log lines:

2024-12-16T05:32:40Z INF Starting provider aggregator aggregator.ProviderAggregator
2024-12-16T05:32:40Z INF Starting provider *traefik.Provider
2024-12-16T05:32:40Z INF Starting provider *docker.SwarmProvider
2024-12-16T05:32:40Z INF Starting provider *acme.ChallengeTLSALPN
2024-12-16T05:32:40Z INF Starting provider *acme.Provider
2024-12-16T05:32:40Z INF Testing certificate renew... acmeCA=https://acme-v02.api.letsencrypt.org/directory providerName=myresolver.acme
2024-12-16T05:42:40Z WRN A new release of Traefik has been found: 3.2.2. Please consider updating.

When I checked the Docker service logs and the container logs, I didn't see any output there.

I used the following config file to deploy version 3.

version: '3'

services:
  traefik:
    image: traefik:v3.1
    hostname: '{{.Node.Hostname}}'
    ports:
      - target: 80
        published: 80
        protocol: tcp
        mode: host
      - target: 443
        published: 443
        protocol: tcp
        mode: host
    networks:
      - proxy
    environment:
      GANDIV5_API_KEY: ${GANDIV5_API_KEY}
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock:ro'
      - '/opt/swarm/acme/acme.json:/letsencrypt/acme.json'
    command:
      - --api.dashboard=true
      - --log.level=INFO
      - --log.filepath=/var/log/traefik.log
      - --accesslog=true
      - --accesslog.filepath=/var/log/traefik-access.log
      - --providers.swarm.exposedByDefault=false
      - --providers.swarm.network=proxy
      - --entrypoints.web.address=:80
      - --entrypoints.web.http.redirections.entrypoint.to=websecure
      - --entrypoints.web.http.redirections.entrypoint.scheme=https
      - --entrypoints.websecure.address=:443
      - --entrypoints.websecure.http.tls.certresolver=myresolver
      - --certificatesresolvers.myresolver.acme.email=admin@example.com
      - --certificatesresolvers.myresolver.acme.storage=/letsencrypt/acme.json
      - --certificatesresolvers.myresolver.acme.dnschallenge.provider=gandiv5
        #- --certificatesresolvers.myresolver.acme.dnschallenge.resolvers=217.70.185.65:53,8.8.8.8:53
        #- --certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.role == leader
      labels:
        - traefik.enable=true
        - traefik.http.routers.mydashboard.rule=Host(`traefik.example.com`)
        - traefik.http.routers.mydashboard.service=api@internal
        - traefik.http.routers.mydashboard.middlewares=myauth
        - traefik.http.routers.mydashboard.tls=true
        - traefik.http.services.mydashboard.loadbalancer.server.port=1337
        - traefik.http.middlewares.myauth.basicauth.users=test:$$apr1$$H6uskkkW$$IgXLP6ewTrSuBkTrqE8wj/

networks:
  proxy:
    name: proxy
    driver: overlay
    attachable: true
    external: true

I checked the permissions of the acme.json file and they are 600.
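
On the empty Docker service and container logs mentioned above: the stack file sets --log.filepath and --accesslog.filepath, so Traefik writes to those files inside the container instead of stdout, which is a likely reason the Docker logs stay empty. A minimal sketch of the logging flags only, assuming logs on stdout are wanted while debugging ACME (everything else stays as in the stack file):

```yaml
services:
  traefik:
    image: traefik:v3.1
    command:
      # No --log.filepath: Traefik logs go to stdout, where
      # `docker service logs` can pick them up.
      - --log.level=DEBUG
      # No --accesslog.filepath: access logs go to stdout as well.
      - --accesslog=true
```

With a stack named li_v1, the service would be called li_v1_traefik, so docker service logs li_v1_traefik should then show the ACME messages.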

DEBUG LOG

Starting provider *acme.Provider
2024-12-16T06:23:46Z DBG github.com/traefik/traefik/v3/pkg/provider/aggregator/aggregator.go:203 > *acme.Provider provider configuration config={"HTTPChallengeProvider":{},"ResolverName":"myresolver","TLSChallengeProvider":{},"caServer":"https://acme-v02.api.letsencrypt.org/directory","certificatesDuration":2160,"dnsChallenge":{"provider":"gandiv5"},"email":"admin.example.com","keyType":"RSA4096","storage":"/letsencrypt/acme.json","store":{}}
2024-12-16T06:23:46Z DBG github.com/traefik/traefik/v3/pkg/provider/acme/provider.go:213 > Attempt to renew certificates "720h0m0s" before expiry and check every "24h0m0s" acmeCA=https://acme-v02.api.letsencrypt.org/directory providerName=myresolver.acme
2024-12-16T06:23:46Z INF github.com/traefik/traefik/v3/pkg/provider/acme/provider.go:786 > Testing certificate renew... acmeCA=https://acme-v02.api.letsencrypt.org/directory providerName=myresolver.acme
2024-12-16T06:23:46Z DBG github.com/traefik/traefik/v3/pkg/server/configurationwatcher.go:227 > Configuration received config={"http":{"middlewares":{"redirect-web-to-websecure":{"redirectScheme":{"permanent":true,"port":"443","scheme":"https"}}},"models":{"websecure":{"tls":{"certResolver":"myresolver"}}},"routers":{"web-to-websecure":{"entryPoints":["web"],"middlewares":["redirect-web-to-websecure"],"priority":9223372036854775806,"rule":"HostRegexp(`^.+$`)","ruleSyntax":"v3","service":"noop@internal"}},"serversTransports":{"default":{"maxIdleConnsPerHost":200}},"services":{"api":{},"dashboard":{},"noop":{}}},"tcp":{"serversTransports":{"default":{"dialKeepAlive":"15s","dialTimeout":"30s"}}},"tls":{},"udp":{}} providerName=internal
2024-12-16T06:23:46Z DBG github.com/traefik/traefik/v3/pkg/server/configurationwatcher.go:227 > Configuration received config={"http":{},"tcp":{},"tls":{},"udp":{}} providerName=myresolver.acme
2024-12-16T06:23:46Z DBG github.com/traefik/traefik/v3/pkg/provider/docker/pswarm.go:93 > Provider connection established with docker 27.3.1 (API 1.47) providerName=swarm
2024-12-16T06:23:46Z DBG github.com/traefik/traefik/v3/pkg/provider/docker/config.go:184 > Filtering disabled container container=portainer-agent-1yly30f3gdahb1r0tgvkj0o4x providerName=swarm
2024-12-16T06:23:46Z DBG github.com/traefik/traefik/v3/pkg/provider/docker/config.go:184 > Filtering disabled container container=portainer-agent-f7219thcj7ro871m0o3w559um providerName=swarm
2024-12-16T06:23:46Z DBG github.com/traefik/traefik/v3/pkg/provider/docker/config.go:184 > Filtering disabled container container=portainer-agent-tt8gcm25irngl5n7rrjkit07z providerName=swarm
2024-12-16T06:23:46Z DBG github.com/traefik/traefik/v3/pkg/server/configurationwatcher.go:227 > Configuration received config={"http":{"middlewares":{"myauth":{"basicAuth":{"users":["test:$apr1$H6uskkkW$IgXLP6ewTrSuBkTrqE8wj/"]}}},"routers":{"mydashboard":{"middlewares":["myauth"],"rule":"Host(`traefik.example.com`)","service":"api@internal","tls":{}}},"services":{"mydashboard":{"loadBalancer":{"passHostHeader":true,"responseForwarding":{"flushInterval":"100ms"},"servers":[{"url":"http://10.0.5.177:1337"}]}}}},"tcp":{},"tls":{},"udp":{}} providerName=swarm
2024-12-16T06:23:47Z DBG github.com/traefik/traefik/v3/pkg/tls/tlsmanager.go:321 > No default certificate, fallback to the internal generated certificate tlsStoreName=default

Did you deploy to Swarm with docker stack deploy?

@bluepuma77
Yes, I used the following command:
docker stack deploy -c new.yaml li_v1

Maybe try the docker-swarm-traefik-dnschallenge example and use your own dnschallenge.provider (doc); that works for me.

Thank you very much for your response. I was trying to deploy a simple Nginx server to verify the integration is working fine, but Let's Encrypt is not generating the certificate. Could you please advise me on this? I don't see any specific errors, and the dashboard is working fine, but the external deployment is not working.


services:
  nginx:
    image: nginx:latest
    networks:
      - traefik
    restart: always
    labels:
      - traefik.enable=true
      - traefik.http.routers.nginx.rule=Host(`nginx-example.com`)
      - traefik.http.services.nginx.loadbalancer.server.port=80
      - traefik.http.routers.nginx.tls.certresolver=certresolver
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.role == leader

networks:
  traefik:
    name: traefik
    driver: overlay
    attachable: true
    external: true

In the Traefik logs I can see the following:
2024-12-16T11:28:59Z DBG github.com/traefik/traefik/v3/pkg/tls/tlsmanager.go:228 > Serving default certificate for request: "nginx-example.com

Labels in Swarm need to go inside the deploy: section.

Therefore, if you use a compose file with Swarm Mode, labels should be defined in the deploy part of your service.

Doc
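
Applied to the nginx service above, that would look roughly like the sketch below. The network name traefik and the resolver name certresolver are copied from the post and are assumed to match what the Traefik service itself is configured with (its --providers.swarm.network value and its certificatesresolvers name).

```yaml
services:
  nginx:
    image: nginx:latest
    networks:
      - traefik
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.role == leader
      # In Swarm mode Traefik reads service labels, so they go under deploy:
      labels:
        - traefik.enable=true
        - traefik.http.routers.nginx.rule=Host(`nginx-example.com`)
        - traefik.http.routers.nginx.tls.certresolver=certresolver
        - traefik.http.services.nginx.loadbalancer.server.port=80

networks:
  traefik:
    # Assumed to be the existing overlay network that Traefik is attached to.
    external: true
```

The restart: always key is left out because docker stack deploy ignores it; the Swarm equivalent is deploy.restart_policy.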

@bluepuma77
I was able to resolve the issue and keep things on track—thank you so much!
