DNS provider timeout when trying to read TXT record (with and without delay)

From what I can tell, Traefik is failing to create the TXT record with my DNS service (Digital Ocean) when using the dnsChallenge, and as a result the SSL certificates never get created or validated. I am assuming it is Traefik that creates the record and not Let's Encrypt; I am a little unclear on what actually happens in this process.

The error I am receiving is as follows:
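
On the question of who creates the record: in the ACME DNS-01 challenge, the ACME client (here Traefik, through its lego library) creates the TXT record via the DNS provider's API, and Let's Encrypt then queries DNS to check it. The sketch below only illustrates the naming and value rules from the ACME specification; the key authorization string is a made-up placeholder, and the function names are hypothetical, not part of Traefik or lego.

```python
import base64
import hashlib


def challenge_record_name(domain: str) -> str:
    """Return the DNS name that holds the dns-01 TXT record for `domain`."""
    # A wildcard such as *.mydomain.nz is validated at the base domain.
    base = domain[2:] if domain.startswith("*.") else domain
    return f"_acme-challenge.{base.rstrip('.')}."


def challenge_txt_value(key_authorization: str) -> str:
    """Return the TXT value: unpadded base64url of SHA-256(key authorization)."""
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


# Placeholder input only; a real key authorization comes from the ACME server
# token combined with the account key thumbprint.
print(challenge_record_name("*.mydomain.nz"))
print(challenge_txt_value("hypothetical-token.hypothetical-thumbprint"))
```

The value is always 43 characters long, which matches the shape of the value shown in the error message.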

{"level":"info","msg":"Server configuration reloaded on :80","time":"2019-08-23T10:04:05Z"}
{"level":"info","msg":"Server configuration reloaded on :443","time":"2019-08-23T10:04:05Z"}
{"level":"info","msg":"Server configuration reloaded on :8080","time":"2019-08-23T10:04:05Z"}
{"level":"error","msg":"Unable to obtain ACME certificate for domains \"mydomain.nz\" : unable to generate a certificatefor the domains [mydomain.nz]: acme: Error -\u003e One or more domains had a problem:\n[mydomain.nz] time limit exceeded: last error: NS ns2.digitalocean.com. did not return the expected TXT record [fqdn: mydomain.nz., value: tm_yW519dV22b3bk6oIOhRD0EVjGXBr6MDYwxOaQXmA]: \n","time":"2019-08-23T10:05:09Z"}

I had a similar issue in the past, and it was determined that the problem originated from Cloudflare. After playing with my Cloudflare config I eventually gave up and switched to Digital Ocean in an attempt to start from scratch. Unfortunately, I am now getting the same error.

My traefik.toml is as follows:

debug = true

logLevel = "DEBUG"
defaultEntryPoints = ["https","http"]

[traefikLog]
  filePath = "traefik.log"
  format   = "json"

[accessLog]
  filePath = "access.log"
  format = "json"

[entryPoints]
  [entryPoints.http]
  address = ":80"
    [entryPoints.http.redirect]
    entryPoint = "https"
  [entryPoints.https]
  address = ":443"
  [entryPoints.https.tls]

[retry]

[acme]
email = "myemail@gmail.com"
storage = "acme.json"
caServer = "https://acme-staging-v02.api.letsencrypt.org/directory"
entryPoint = "https"
[acme.dnsChallenge]
  provider = "digitalocean"
  delayBeforeCheck = 0
[[acme.domains]]
   main = "*.mydomain.nz"
   sans = ["mydomain.nz"]

I have tried delays of 0 and 600, neither of which worked. Notably, I don't seem to get any errors in the logs with the longer delay, but truthfully I have no idea what I am looking for.

Hello,

You can use some extra configuration:

Environment Variable Name   Description                                            Default (seconds)
DO_POLLING_INTERVAL         Time between DNS propagation check                     5
DO_PROPAGATION_TIMEOUT      Maximum waiting time for DNS propagation               60
DO_TTL                      The TTL of the TXT record used for the DNS challenge   30

I recommend increasing the value of DO_PROPAGATION_TIMEOUT.

Hello,

Cheers for that, I tried adding it like so:

[acme.dnsChallenge]
  provider = "digitalocean"
  delayBeforeCheck = 0
  DO_PROPAGATION_TIMEOUT = 600

Same error, but I feel I might be configuring it wrong. I believe this is overriding the lego config? I had a look at the Traefik docs but couldn't find an example of doing this.

DO_PROPAGATION_TIMEOUT is an Environment Variable like DO_AUTH_TOKEN.

This option cannot be defined in the traefik.toml file.

You have to set the env var as you set DO_AUTH_TOKEN.
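
To illustrate why a key in traefik.toml has no effect here, the following is a minimal sketch of how a setting read from the process environment with a fallback default behaves. It is not lego's actual code; `read_setting` and `DEFAULTS` are hypothetical names, and the default values are the ones from the table quoted above.

```python
import os
from typing import Mapping

# Defaults in seconds, taken from the table quoted earlier in the thread.
DEFAULTS = {
    "DO_POLLING_INTERVAL": 5,
    "DO_PROPAGATION_TIMEOUT": 60,
    "DO_TTL": 30,
}


def read_setting(name: str, env: Mapping[str, str] = os.environ) -> int:
    """Return an integer setting from the environment, or its default."""
    raw = env.get(name)
    if raw is None:
        # Nothing in the environment: the TOML file is never consulted here.
        return DEFAULTS[name]
    return int(raw)


print(read_setting("DO_PROPAGATION_TIMEOUT", {}))
print(read_setting("DO_PROPAGATION_TIMEOUT", {"DO_PROPAGATION_TIMEOUT": "600"}))
```

In this model, a value only changes if the variable is present in the environment of the Traefik process itself, for example through the container's environment section.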

Hey,

Tried that, and I think it is getting closer, but it still seems to think the certs are invalid.

Initially nothing seemed to change even with the environment variable set for the Traefik container, so I added a delayBeforeCheck of 360 to the dnsChallenge (1 minute longer than the timeout I had set in the environment variable).

This was the output I got:

{"level":"info","msg":"Server configuration reloaded on :443","time":"2019-08-24T00:17:08Z"}
{"level":"info","msg":"Server configuration reloaded on :8080","time":"2019-08-24T00:17:08Z"}
{"level":"info","msg":"Server configuration reloaded on :80","time":"2019-08-24T00:17:08Z"}
{"level":"debug","msg":"Building ACME client...","time":"2019-08-24T00:17:09Z"}
{"level":"debug","msg":"https://acme-v02.api.letsencrypt.org/directory","time":"2019-08-24T00:17:09Z"}
{"level":"info","msg":"Register...","time":"2019-08-24T00:17:10Z"}
{"level":"debug","msg":"Using DNS Challenge provider: digitalocean","time":"2019-08-24T00:17:10Z"}
{"level":"debug","msg":"Delaying 360000000000 rather than validating DNS propagation now.","time":"2019-08-24T00:17:12Z"}
{"level":"debug","msg":"Certificates obtained for domains [mydomain.nz]","time":"2019-08-24T00:23:23Z"}
{"level":"debug","msg":"Configuration received from provider ACME: {}","time":"2019-08-24T00:23:23Z"}
{"level":"debug","msg":"Wiring frontend frontend-Host-mydomain-nz-2 to entryPoint https","time":"2019-08-24T00:23:23Z"}
{"level":"debug","msg":"Creating backend backend-heimdall-setup","time":"2019-08-24T00:23:23Z"}
{"level":"debug","msg":"Adding TLSClientHeaders middleware for frontend frontend-Host-mydomain-nz-2","time":"2019-08-24T00:23:23Z"}
{"level":"debug","msg":"Creating load-balancer wrr","time":"2019-08-24T00:23:23Z"}
{"level":"debug","msg":"Creating server server-heimdall-ec2771a84e365132605d64ba3ab37537 at http://192.168.1.250:80 with weight 1","time":"2019-08-24T00:23:23Z"}
{"level":"debug","msg":"Creating retries max attempts 1","time":"2019-08-24T00:23:23Z"}
{"level":"debug","msg":"Creating route route-frontend-Host-mydomain-nz-2 Host:mydomain.nz","time":"2019-08-24T00:23:23Z"}
// ... 
{"level":"debug","msg":"Adding certificate for domain(s) mydomain.nz","time":"2019-08-24T00:23:23Z"}
{"level":"info","msg":"Server configuration reloaded on :443","time":"2019-08-24T00:23:23Z"}
{"level":"info","msg":"Server configuration reloaded on :8080","time":"2019-08-24T00:23:23Z"}
{"level":"info","msg":"Server configuration reloaded on :80","time":"2019-08-24T00:23:23Z"}
{"level":"debug","msg":"Basic auth failed","time":"2019-08-24T00:25:20Z"}
{"level":"debug","msg":"Basic auth failed","time":"2019-08-24T00:25:25Z"}
{"level":"warning","msg":"A new release has been found: 1.7.14. Please consider updating.","time":"2019-08-24T00:27:09Z"}
{"level":"debug","msg":"Serving default cert for request: \"158.140.236.43\"","time":"2019-08-24T00:28:20Z"}
{"level":"debug","msg":"http: TLS handshake error from 107.178.236.31:30376: remote error: tls: unknown certificate authority","time":"2019-08-24T00:28:20Z"}

I have no idea what the IP in the very last message is meant to be, but it looks like it is failing there?
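
A note on the "Delaying 360000000000" line in the output above: the number appears to be a duration in nanoseconds, which would make it the configured 360 seconds. The sketch below checks that reading against the timestamps of the "Delaying" and "Certificates obtained" lines. The helper names are hypothetical, and the nanosecond interpretation is an assumption based on how Go durations are usually represented.

```python
from datetime import datetime, timedelta

LOG_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def nanoseconds_to_timedelta(nanoseconds: int) -> timedelta:
    """Convert a nanosecond count into a timedelta."""
    return timedelta(seconds=nanoseconds / 1_000_000_000)


def elapsed_between(start: str, end: str) -> timedelta:
    """Return the time between two log timestamps in the format used above."""
    return datetime.strptime(end, LOG_TIME_FORMAT) - datetime.strptime(start, LOG_TIME_FORMAT)


delay = nanoseconds_to_timedelta(360000000000)
waited = elapsed_between("2019-08-24T00:17:12Z", "2019-08-24T00:23:23Z")
print(delay.total_seconds(), waited.total_seconds())
```

The configured delay works out to 6 minutes and the logs show roughly 6 minutes 11 seconds between the two lines, so the delay itself seems to be applied as configured.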

Just to confirm here is the block for my traefik docker:

  traefik:
    image: traefik:v1.7.12
    command: --web --docker --docker.watch --docker.domain=${DOMAIN} \
             --docker.exposedbydefault=false --acme.domains=${DOMAIN}
    container_name: traefik
    hostname: traefik
    networks:
      br0:
        ipv4_address: 192.168.1.253
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ${CONFIG}/traefik/acme.json:/acme.json
      - ${CONFIG}/traefik/traefik.log:/traefik.log
      - ${CONFIG}/traefik/access.log:/access.log
      - ${CONFIG}/traefik/traefik.toml:/etc/traefik/traefik.toml
      - ${CONFIG}/traefik/.htpasswd:/etc/traefik/.htpasswd:ro
    environment:
      - DO_PROPAGATION_TIMEOUT = 300
      - DO_AUTH_TOKEN=???
    labels:
      traefik.enable: "true"
      traefik.frontend.rule: "Host:monitor.${DOMAIN}"
      traefik.port: "8080"
      traefik.frontend.auth.basic: "${HTPASSWD}"
      com.ouroboros.enable: "true"
    restart: unless-stopped

EDIT:

I updated DO_PROPAGATION_TIMEOUT = 300 to DO_PROPAGATION_TIMEOUT=300 (I wasn't sure if it made a difference) and now I get the following output:
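
The spaces around the equals sign can matter. The sketch below assumes that a list-style environment entry is split at the first "=" without trimming whitespace; that is an assumption about the tooling rather than something confirmed in this thread, and `split_env_entry` is a hypothetical helper.

```python
from typing import Tuple


def split_env_entry(entry: str) -> Tuple[str, str]:
    """Split a KEY=VALUE entry at the first '=' without trimming anything."""
    name, _, value = entry.partition("=")
    return name, value


# With spaces, the variable name ends in a space and no longer matches
# the name the DNS provider code looks up.
print(repr(split_env_entry("DO_PROPAGATION_TIMEOUT = 300")))
print(repr(split_env_entry("DO_PROPAGATION_TIMEOUT=300")))
```

Under that assumption, the first form defines a variable whose name has a trailing space, so the intended setting would never be picked up.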

{"level":"debug","msg":"Adding certificate for domain(s) mydomain.nz","time":"2019-08-24T00:49:27Z"}
{"level":"info","msg":"Server configuration reloaded on :443","time":"2019-08-24T00:49:27Z"}
{"level":"info","msg":"Server configuration reloaded on :8080","time":"2019-08-24T00:49:27Z"}
{"level":"info","msg":"Server configuration reloaded on :80","time":"2019-08-24T00:49:27Z"}
{"level":"warning","msg":"A new release has been found: 1.7.14. Please consider updating.","time":"2019-08-24T00:53:10Z"}
{"level":"debug","msg":"Serving default cert for request: \"\"","time":"2019-08-24T00:55:13Z"}
{"level":"debug","msg":"http: TLS handshake error from 74.82.47.5:64570: tls: client offered an unsupported, maximum protocolversion of 300","time":"2019-08-24T00:56:51Z"}
{"level":"debug","msg":"Serving default cert for request: \"\"","time":"2019-08-24T00:57:16Z"}
{"level":"debug","msg":"http: TLS handshake error from 74.82.47.5:9432: tls: no cipher suite supported by both client and server","time":"2019-08-24T00:57:16Z"}
{"level":"debug","msg":"Serving default cert for request: \"\"","time":"2019-08-24T00:57:47Z"}
{"level":"debug","msg":"http: TLS handshake error from 74.82.47.5:18822: tls: no cipher suite supported by both client and server","time":"2019-08-24T00:57:47Z"}

Cheers.

So I tried updating Traefik just to see what would happen, and I got this error:

{"level":"debug","msg":"Using DNS Challenge provider: digitalocean","time":"2019-08-26T07:55:22Z"}
{"level":"error","msg":"Unable to obtain ACME certificate for domains \"mydomain.nz\" : unable to generate a certificatefor the domains [mydomain.nz]: acme: error: 429 :: POST :: https://acme-v02.api.letsencrypt.org/acme/new-order :: urn:ietf:params:acme:error:rateLimited :: Error creating new order :: too many certificates already issued for exact set of domains:mydomain.nz: see https://letsencrypt.org/docs/rate-limits/, url: ","time":"2019-08-26T07:55:22Z"}

Would this be because I have kept trying to start up the Traefik container?

Cheers.

So I tried adding:

caServer = "https://acme-staging-v02.api.letsencrypt.org/directory"

to my traefik.toml to get around the issue with the pending Let's Encrypt authorizations, and now I get the following output:

{"level":"info","msg":"Server configuration reloaded on :80","time":"2019-08-27T08:24:25Z"}
{"level":"info","msg":"Server configuration reloaded on :443","time":"2019-08-27T08:24:25Z"}
{"level":"info","msg":"Server configuration reloaded on :8080","time":"2019-08-27T08:24:25Z"}
{"level":"debug","msg":"Building ACME client...","time":"2019-08-27T08:24:27Z"}
{"level":"debug","msg":"https://acme-staging-v02.api.letsencrypt.org/directory","time":"2019-08-27T08:24:27Z"}
{"level":"info","msg":"Register...","time":"2019-08-27T08:24:27Z"}
{"level":"debug","msg":"Using DNS Challenge provider: digitalocean","time":"2019-08-27T08:24:27Z"}
{"level":"debug","msg":"Delaying 360000000000 rather than validating DNS propagation now.","time":"2019-08-27T08:24:29Z"}
{"level":"debug","msg":"Certificates obtained for domains [mydomain.nz]","time":"2019-08-27T08:30:38Z"}
{"level":"debug","msg":"Configuration received from provider ACME: {}","time":"2019-08-27T08:30:38Z"}
{"level":"debug","msg":"Wiring frontend frontend-Host-mydomain-nz-3 to entryPoint https","time":"2019-08-27T08:30:38Z"}

Which to me looks like it's worked? But if I try and navigate to the domain I get:

NET::ERR_CERT_AUTHORITY_INVALID

Is that potentially because I am using the staging environment? If so does anyone know how to clear pending letsencrypt authorizations?
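
Certificates from the Let's Encrypt staging environment are issued by a CA that browsers do not trust, so an authority error is expected while the staging caServer is in use. A quick way to confirm which environment a given run used is to look for the directory URL that Traefik logs at debug level. The sketch below does that for JSON log lines like the ones above; the function names are hypothetical.

```python
import json
from typing import Iterable, Optional

STAGING_HOST = "acme-staging-v02.api.letsencrypt.org"


def acme_directory_from_logs(lines: Iterable[str]) -> Optional[str]:
    """Return the first ACME directory URL found in JSON log lines, if any."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # Skip lines that are not valid JSON.
        if not isinstance(entry, dict):
            continue
        msg = str(entry.get("msg", ""))
        if msg.startswith("https://") and msg.endswith("/directory"):
            return msg
    return None


def is_staging(directory_url: str) -> bool:
    """Report whether a directory URL points at the staging environment."""
    return STAGING_HOST in directory_url


sample = [
    '{"level":"debug","msg":"Building ACME client...","time":"2019-08-27T08:24:27Z"}',
    '{"level":"debug","msg":"https://acme-staging-v02.api.letsencrypt.org/directory","time":"2019-08-27T08:24:27Z"}',
]
url = acme_directory_from_logs(sample)
print(url, is_staging(url) if url else None)
```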

I think this is still an improvement, before it didn't even appear to have invalid certificates...?

Cheers

Okay, I tried commenting out:

caServer = "https://acme-staging-v02.api.letsencrypt.org/directory"

And now it all appears to go through:

{"level":"info","msg":"Server configuration reloaded on :80","time":"2019-08-28T08:19:59Z"}
{"level":"info","msg":"Server configuration reloaded on :443","time":"2019-08-28T08:19:59Z"}
{"level":"info","msg":"Server configuration reloaded on :8080","time":"2019-08-28T08:19:59Z"}
{"level":"debug","msg":"Building ACME client...","time":"2019-08-28T08:20:03Z"}
{"level":"debug","msg":"https://acme-v02.api.letsencrypt.org/directory","time":"2019-08-28T08:20:03Z"}
{"level":"info","msg":"Register...","time":"2019-08-28T08:20:03Z"}
{"level":"debug","msg":"Using DNS Challenge provider: digitalocean","time":"2019-08-28T08:20:04Z"}
{"level":"debug","msg":"Delaying 360000000000 rather than validating DNS propagation now.","time":"2019-08-28T08:20:06Z"}
{"level":"debug","msg":"Certificates obtained for domains [mydomain.nz]","time":"2019-08-28T08:26:18Z"}
{"level":"debug","msg":"Configuration received from provider ACME: {}","time":"2019-08-28T08:26:18Z"}
{"level":"debug","msg":"Wiring frontend frontend-Host-mydomain-nz-4 to entryPoint https","time":"2019-08-28T08:26:18Z"}
{"level":"debug","msg":"Creating backend backend-heimdall-setup","time":"2019-08-28T08:26:18Z"}
{"level":"debug","msg":"Creating load-balancer wrr","time":"2019-08-28T08:26:18Z"}

Yet the certificates still seem to be invalid. I am really confused at this point.

Did you manage to solve this?
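
One thing that may be worth ruling out, though it is not a confirmed diagnosis: the logs report certificates obtained for [mydomain.nz] only, while the container label routes Host:monitor.${DOMAIN}, and an earlier log shows "Serving default cert". A certificate for the bare domain does not cover a subdomain, whereas a wildcard covers exactly one extra label. The sketch below is a simplified version of that matching rule with a hypothetical function name; real TLS libraries apply additional checks.

```python
def name_covers(cert_name: str, host: str) -> bool:
    """Return True if a certificate name (possibly a wildcard) covers host."""
    cert_name = cert_name.lower().rstrip(".")
    host = host.lower().rstrip(".")
    if cert_name.startswith("*."):
        suffix = cert_name[1:]  # e.g. ".mydomain.nz"
        if not host.endswith(suffix):
            return False
        label = host[: -len(suffix)]
        # The wildcard stands for exactly one non-empty left-most label.
        return label != "" and "." not in label
    return cert_name == host


for name in ("mydomain.nz", "*.mydomain.nz"):
    print(name, name_covers(name, "monitor.mydomain.nz"))
```

If the served certificate lists only the bare domain, requests for a subdomain would fall back to another certificate, which browsers would reject.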