I am quite new to traefik and followed some helpful guides to build a new docker swarm starting with traefik and portainer on the manager node, secured with LE certificates.
After deploying the stacks for traefik and portainer, accessing the websites for traefik dashboard and portainer works.
But obtaining the certificate from LE leaves me with the following error:
error: 400 :: urn:ietf:params:acme:error:connection :: Timeout during read (your server may be slow or overloaded), url: \n" routerName=portainer-secure@docker rule="Host(hostname)" providerName=http.acme
I suggest you go reread the linked post and the parameters that it links to.
The timeout issue is indicative of the portainer router being connected to two networks.
I experienced it myself this week when I mistyped traefik in the traefik.docker.network label.
docker network inspect webproxy confirmed that there are three members in the network: the traefik container, the portainer container, and a webproxy-endpoint.
I can also reach the portainer interface on https by accepting the TRAEFIK DEFAULT CERT.
From a shell (sh) inside the traefik container, I can also ping the portainer container by IP and by service name.
So to me it seems they are communicating using that network.
Another question for my understanding:
Does the handling of TLS challenges for Let's Encrypt really depend on the containers that use the hostname? Is it not handled by traefik itself?
A test authorization for swarm.spicyweb.de to the Let's Encrypt staging service has revealed issues that may prevent any certificate for this domain being issued.
Timeout during read (your server may be slow or overloaded)
Yes, I tried httpChallenge first. It results in the following error:
traefik_local_main.0.894gjy53z4s7@master | time="2020-07-06T11:30:12+02:00" level=error msg="Unable to obtain ACME certificate for domains "...": unable to generate a certificate for the domains [...]: error: one or more domains had a problem:\n[swarm.spicyweb.de] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Fetching http://swarm.spicyweb.de/.well-known/acme-challenge/vqZXGXq5PWC9KIo-lLtdOTC-SQBg6kJMv3IA8IHT3WQ: Timeout after connect (your server may be slow or overloaded), url: \n" providerName=le-tls.acme routerName=portainer-secure@docker rule="Host(swarm.spicyweb.de)"
Is httpChallenge the one to prefer over tlsChallenge?
I also saved my current config to a cloud folder; I only changed the username/password and email address.
This way I think I can better keep them in sync with config changes while searching for a solution. https://cloud.fuermann.net/s/YQLzXkynEAHyejB
I like to use tlsChallenge; it's a personal preference. I have read that port 80 is sometimes blocked by providers (ISPs), so tlsChallenge can be an option in that case, as an alternative to the DNS challenge.
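To make the port point concrete: httpChallenge is validated over port 80 and tlsChallenge over port 443, so whichever one you pick, that port has to be reachable from the internet. Below is a minimal Python sketch that checks whether a TCP connection to each port succeeds. The names CHALLENGE_PORTS, port_is_reachable and check_challenge_ports are made up for this example and are not part of Traefik, and the result only says something useful when the script runs on a machine outside your own network.

```python
import socket

# Port that Let's Encrypt connects to for each Traefik challenge type.
CHALLENGE_PORTS = {"httpChallenge": 80, "tlsChallenge": 443}


def port_is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def check_challenge_ports(host: str, timeout: float = 5.0) -> dict:
    """Map each challenge type to whether its port accepted a connection."""
    return {
        challenge: port_is_reachable(host, port, timeout)
        for challenge, port in CHALLENGE_PORTS.items()
    }
```

Calling check_challenge_ports("swarm.spicyweb.de") returns a dict with one True/False entry per challenge type. A successful TCP connect only shows that the port is open; it does not prove that the challenge itself will pass.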
Nice!
Is that with httpChallenge or tlsChallenge? Can you confirm both now work, for science?
Edit: I checked via letsdebug. Green for both now.
Sorry to resurrect this thread, but in case anyone else comes across this via Google search, I wanted to (possibly) help them out.
I had the same problem while following this tutorial on a Linode-provisioned server. The first time I ran docker stack deploy -c traefik.yml traefik and watched the logs as it started up, I got:
"Unable to obtain ACME certificate for domains \"<mydomain>\": unable to generate a certificate for the domains [<mydomain>]: error: one or more domains had a problem:\n[<mydomain>] acme: error: 400 :: urn:ietf:params:acme:error:dns :: DNS problem: NXDOMAIN looking up A for <mydomain> - check that a DNS record exists for this domain, url: \n" providerName=le.acme routerName=traefik-public-https@docker rule="Host(`<mydomain>`)"
I realized at that point that I had forgotten to create the DNS record. I created it in my Linode dashboard, adding both the IPv4 and IPv6 addresses. I then deleted the service, redeployed the stack, and started getting:
"Unable to obtain ACME certificate for domains \"<mydomain>\": unable to generate a certificate for the domains [<mydomain>]: error: one or more domains had a problem:\n[<mydomain>] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Timeout during read (your server may be slow or overloaded), url: \n" routerName=traefik-public-https@docker providerName=le.acme rule="Host(`<mydomain>`)"
After coming across this post, I tried removing the IPv6 address and then removing and redeploying the service, but got the same error. At this point I was a bit stumped, so I went spelunking for more data in the volume. That's when I had the idea to completely delete the Docker volume I had made and then redeploy, and that seemed to fix it.
I'm assuming that the IPv6 address was the problem, but it might also have been that the initial failure left some bad state in the Docker volume. Since the error message changed after adding the records, I'm guessing it wasn't the bad state.
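For anyone who wants to test the IPv6 theory: as far as I know, Let's Encrypt prefers the IPv6 address when a domain has both A and AAAA records, so an AAAA record pointing at an address that doesn't answer could plausibly produce this kind of timeout. Here is a rough Python sketch that lists the addresses a hostname resolves to per IP family and tries a TCP connection to each one. The function names resolve_by_family and connect_each are my own for illustration. It uses the local resolver, which may differ from public DNS, and a machine without IPv6 connectivity will report the IPv6 address as unreachable even if the server is fine.

```python
import socket


def resolve_by_family(host: str, port: int = 443) -> dict:
    """Return the addresses host resolves to, grouped by IP family."""
    families = {"IPv4": socket.AF_INET, "IPv6": socket.AF_INET6}
    result = {}
    for label, family in families.items():
        try:
            infos = socket.getaddrinfo(
                host, port, family=family, type=socket.SOCK_STREAM
            )
        except socket.gaierror:
            # No record of this family, or the name does not resolve at all.
            infos = []
        # info[4] is the sockaddr tuple; its first element is the address.
        result[label] = sorted({info[4][0] for info in infos})
    return result


def connect_each(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Try a TCP connection to every resolved address; map address -> success."""
    outcome = {}
    for addresses in resolve_by_family(host, port).values():
        for address in addresses:
            try:
                with socket.create_connection((address, port), timeout=timeout):
                    outcome[address] = True
            except OSError:
                outcome[address] = False
    return outcome
```

If connect_each reports True for the IPv4 address and False for the IPv6 one, that points toward the AAAA record rather than stale state in the volume, although it is not proof either way.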