I’m encountering an issue with Traefik failing to generate ACME certificates after a VM restart caused my external IP address to change. I have updated the DNS records to reflect the new external IP, but now I’m getting connection timeouts when trying to reach the ACME challenge URL from the outside.
Environment Details:
Traefik Version: 3.1.5
Deployed Using: Docker Compose
Cloud Platform: Google Cloud (GCE)
External IP: 35.136.188.182
DNS Configuration: Updated to point to the new IP (confirmed propagated)
Ports Open: Verified that ports 80 and 443 are allowed through the firewall on GCP
Problem:
ACME certificate generation fails with the following error:
ERR Unable to obtain ACME certificate for domains error="unable to generate a certificate for the domains [client.donclemtech.com]: error: one or more domains had a problem: acme: error: 400 :: urn:ietf:params:acme:error:connection :: Fetching http://client.donclemtech.com/.well-known/acme-challenge/...: Timeout during connect (likely firewall problem)"
When I run curl http://client.donclemtech.com, it times out:
* Trying 35.136.188.182:80...
* connect to 35.136.188.182 port 80 failed: Connection timed out
However, when I curl 35.136.188.182:80 directly (using the IP), it connects and the request shows up in the Traefik logs.
What I Have Tried:
DNS Propagation: Confirmed that DNS points to the correct external IP (dig resolves properly).
Firewall: Verified GCP firewall allows ingress on ports 80 and 443.
Traefik Configuration:
Traefik is configured to handle HTTP-01 challenges via port 80.
The problem started after my VM was restarted, causing the external IP to change. DNS has been updated, but the ACME challenge still fails with timeouts on port 80.
I suspect the issue may be related to GCP network routing or some unknown configuration, as the firewall and DNS seem correct, and Traefik’s logs show it is responding locally.
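To rule out a mismatch between what DNS returns and the address the VM actually holds after the restart, a comparison along these lines could be run. This is only a minimal sketch: the function names `resolve_ipv4` and `dns_matches_vm` are hypothetical, and the VM's external IP is assumed to be copied by hand from the GCP console rather than looked up automatically.

```python
import ipaddress
import socket


def resolve_ipv4(hostname: str) -> set[str]:
    """Return the IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # For AF_INET, each entry's sockaddr is an (address, port) tuple.
    return {info[4][0] for info in infos}


def dns_matches_vm(resolved: set[str], vm_external_ip: str) -> bool:
    """True only if DNS returned at least one address and all of them
    equal the VM's external IP (compared as parsed addresses, not strings)."""
    expected = ipaddress.ip_address(vm_external_ip)
    return bool(resolved) and all(
        ipaddress.ip_address(addr) == expected for addr in resolved
    )
```

For example, `dns_matches_vm(resolve_ipv4("client.donclemtech.com"), "<external IP from the console>")` should return True if the record really points at the instance. Comparing parsed addresses instead of eyeballing them makes a single-digit difference hard to miss.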
Request for Help:
What could be preventing external access to port 80, even though the firewall is configured correctly?
Are there any additional settings in GCP or Traefik that I might need to check?
Could there be an issue with Traefik routing requests based on the domain after the IP change?
Out of impatience, I deleted the previously generated certificate in my Docker container, and when that had no effect, I deleted Traefik's own volume as well.
After all of that, it still does not work. I suspected the upgrade to 3.1.6 was the cause, so I downgraded to 3.1.5. Please, I need urgent help with this because my company website has been down since yesterday.
Direct IP Access Connects (34.136.188.182):
curl -v http://34.136.188.182
* Trying 34.136.188.182:80...
* Connected to 34.136.188.182 (34.136.188.182) port 80
> GET / HTTP/1.1
> Host: 34.136.188.182
> User-Agent: curl/8.5.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Content-Type: text/plain; charset=utf-8
< X-Content-Type-Options: nosniff
< Date: Fri, 18 Oct 2024 04:48:46 GMT
< Content-Length: 19
404 page not found
Domain Access Resolves but Times Out (client.donclemtech.com):
curl -v http://client.donclemtech.com
* Host client.donclemtech.com:80 was resolved.
* IPv6: (none)
* IPv4: 35.136.188.182
* Trying 35.136.188.182:80...
* connect to 35.136.188.182 port 80 from 192.168.0.192 port 52398 failed: Connection timed out
* Failed to connect to client.donclemtech.com port 80 after 134273 ms: Couldn't connect to server
Other Domains Also Time Out (task.donclemtech.com):
curl -v http://task.donclemtech.com
* Host task.donclemtech.com:80 was resolved.
* IPv6: (none)
* IPv4: 35.136.188.182
* Trying 35.136.188.182:80...
* connect to 35.136.188.182 port 80 from 192.168.0.192 port 54532 failed: Connection timed out
* Failed to connect to task.donclemtech.com port 80 after 137761 ms: Couldn't connect to server
When I stop Traefik, this is the result:
curl -v http://34.136.188.182
* Trying 34.136.188.182:80...
* connect to 34.136.188.182 port 80 from 192.168.0.192 port 49698 failed: Connection refused
* Failed to connect to 34.136.188.182 port 80 after 206 ms: Couldn't connect to server
* Closing connection
curl: (7) Failed to connect to 34.136.188.182 port 80 after 206 ms: Couldn't connect to server
To show that Traefik is working: when I run curl -v http://34.136.188.182, I get this in my Traefik log:
traefik-1 | 2024-10-18T04:56:38Z ERR Error while Peeking first byte error="read tcp 172.18.0.9:80->167.94.145.110:36070: read: connection reset by peer"
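The curl results in this post fall into three cases: the connection succeeds, it is refused, or it times out. The same check could be scripted so each hostname and IP can be probed the same way. This is a rough sketch, not a definitive tool; `probe_tcp` is a hypothetical helper name and the 5-second timeout is an arbitrary assumption.

```python
import socket


def probe_tcp(host: str, port: int = 80, timeout: float = 5.0) -> str:
    """Classify one TCP connection attempt.

    'connected' : something accepted the connection on that address and port
    'refused'   : the host answered, but nothing is listening on the port
    'timeout'   : no answer at all (packets dropped, or no reachable host there)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        return "refused"
    except (socket.timeout, TimeoutError):
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or network unreachable.
        return f"error: {exc}"
```

Calling `probe_tcp("client.donclemtech.com")` and `probe_tcp("34.136.188.182")` one after the other would reproduce the curl tests above. A "refused" result means the machine itself was reached, while a "timeout" means nothing answered at that address.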