Hey, I keep getting this error when Traefik tries to obtain my Let's Encrypt certificate:
traefik | time="2022-09-24T22:57:40+01:00" level=error msg="Unable to obtain ACME certificate for domains \"*.loki.[REDACTED DOMAIN NAME],loki.[REDACTED DOMAIN NAME]\"" routerName=https-service@file ACME CA="https://acme-staging-v02.api.letsencrypt.org/directory" error="unable to generate a certificate for the domains [*.loki.[REDACTED DOMAIN NAME] [REDACTED DOMAIN NAME]]: error: one or more domains had a problem:\n[*.[REDACTED DOMAIN NAME]] [*.loki.[REDACTED DOMAIN NAME]] acme: error presenting token: cloudflare: failed to find zone [REDACTED DOMAIN NAME].: ListZonesContext command failed: HTTP status 400: Invalid request headers (6003)\n[[REDACTED DOMAIN NAME]] [loki.[REDACTED DOMAIN NAME]] acme: error presenting token: cloudflare: failed to find zone [REDACTED].: ListZonesContext command failed: HTTP status 400: Invalid request headers (6003)\n" providerName=letsEncrypt.acme rule=
Edit: here is the relevant part of the debug log:
traefik | time="2022-09-26T09:14:49+01:00" level=debug msg="Looking for provided certificate(s) to validate [\"*.loki.[REDACTED DOMAIN]\" \"loki.[REDACTED DOMAIN]\"]..." providerName=letsEncrypt.acme ACME CA="https://acme-staging-v02.api.letsencrypt.org/directory"
traefik | time="2022-09-26T09:14:49+01:00" level=debug msg="No ACME certificate generation required for domains [\"*.loki.[REDACTED DOMAIN]\" \"loki.[REDACTED DOMAIN]\"]." ACME CA="https://acme-staging-v02.api.letsencrypt.org/directory" providerName=letsEncrypt.acme
traefik | time="2022-09-26T09:14:49+01:00" level=debug msg="Looking for provided certificate(s) to validate [\"*.loki.[REDACTED DOMAIN]\" \"loki.[REDACTED DOMAIN]\"]..." providerName=letsEncrypt.acme ACME CA="https://acme-staging-v02.api.letsencrypt.org/directory"
traefik | time="2022-09-26T09:14:49+01:00" level=debug msg="No ACME certificate generation required for domains [\"*.loki.[REDACTED DOMAIN]\" \"loki.[REDACTED DOMAIN]\"]." providerName=letsEncrypt.acme ACME CA="https://acme-staging-v02.api.letsencrypt.org/directory"
traefik | time="2022-09-26T09:14:50+01:00" level=debug msg="Using DNS Challenge provider: cloudflare" providerName=letsEncrypt.acme
traefik | time="2022-09-26T09:14:50+01:00" level=debug msg="legolog: [INFO] [*.loki.[REDACTED DOMAIN], loki.[REDACTED DOMAIN]] acme: Obtaining bundled SAN certificate"
traefik | time="2022-09-26T09:14:51+01:00" level=debug msg="legolog: [INFO] [*.loki.[REDACTED DOMAIN]] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3751467894"
traefik | time="2022-09-26T09:14:51+01:00" level=debug msg="legolog: [INFO] [loki.[REDACTED DOMAIN]] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3751467904"
traefik | time="2022-09-26T09:14:51+01:00" level=debug msg="legolog: [INFO] [*.loki.[REDACTED DOMAIN]] acme: use dns-01 solver"
traefik | time="2022-09-26T09:14:51+01:00" level=debug msg="legolog: [INFO] [loki.[REDACTED DOMAIN]] acme: Could not find solver for: tls-alpn-01"
traefik | time="2022-09-26T09:14:51+01:00" level=debug msg="legolog: [INFO] [loki.[REDACTED DOMAIN]] acme: Could not find solver for: http-01"
traefik | time="2022-09-26T09:14:51+01:00" level=debug msg="legolog: [INFO] [loki.[REDACTED DOMAIN]] acme: use dns-01 solver"
traefik | time="2022-09-26T09:14:51+01:00" level=debug msg="legolog: [INFO] [*.loki.[REDACTED DOMAIN]] acme: Preparing to solve DNS-01"
traefik | time="2022-09-26T09:14:52+01:00" level=debug msg="legolog: [INFO] [loki.[REDACTED DOMAIN]] acme: Preparing to solve DNS-01"
traefik | time="2022-09-26T09:14:52+01:00" level=debug msg="legolog: [INFO] [*.loki.[REDACTED DOMAIN]] acme: Cleaning DNS-01 challenge"
traefik | time="2022-09-26T09:14:53+01:00" level=debug msg="legolog: [WARN] [*.loki.[REDACTED DOMAIN]] acme: cleaning up failed: cloudflare: failed to find zone [REDACTED DOMAIN].: ListZonesContext command failed: HTTP status 400: Invalid request headers (6003) "
traefik | time="2022-09-26T09:14:53+01:00" level=debug msg="legolog: [INFO] [loki.[REDACTED DOMAIN]] acme: Cleaning DNS-01 challenge"
traefik | time="2022-09-26T09:14:54+01:00" level=debug msg="legolog: [WARN] [loki.[REDACTED DOMAIN]] acme: cleaning up failed: cloudflare: failed to find zone [REDACTED DOMAIN].: ListZonesContext command failed: HTTP status 400: Invalid request headers (6003) "
traefik | time="2022-09-26T09:14:54+01:00" level=debug msg="legolog: [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3751467894"
traefik | time="2022-09-26T09:14:55+01:00" level=debug msg="legolog: [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3751467904"
traefik | time="2022-09-26T09:14:55+01:00" level=error msg="Unable to obtain ACME certificate for domains \"*.loki.[REDACTED DOMAIN],loki.[REDACTED DOMAIN]\"" ACME CA="https://acme-staging-v02.api.letsencrypt.org/directory" providerName=letsEncrypt.acme routerName=https-service@file rule= error="unable to generate a certificate for the domains [*.loki.[REDACTED DOMAIN] loki.[REDACTED DOMAIN]]: error: one or more domains had a problem:\n[*.loki.[REDACTED DOMAIN]] [*.loki.[REDACTED DOMAIN]] acme: error presenting token: cloudflare: failed to find zone [REDACTED DOMAIN].: ListZonesContext command failed: HTTP status 400: Invalid request headers (6003)\n[loki.[REDACTED DOMAIN]] [loki.[REDACTED DOMAIN]] acme: error presenting token: cloudflare: failed to find zone [REDACTED DOMAIN].: ListZonesContext command failed: HTTP status 400: Invalid request headers (6003)\n"
Note: I can successfully query Cloudflare's API directly with the same email and API token:
https https://api.cloudflare.com/client/v4/zones names:[REDACTED DOMAIN] [Auth-headers...]
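Since error 6003 is about request headers, it may be worth comparing the exact authentication headers each client sends. Below is a minimal Python sketch of the two header schemes Cloudflare accepts (Global API key with email, or a scoped API token as a Bearer token). The helper names `cloudflare_auth_headers` and `build_zone_lookup` are made up for illustration, and which scheme the elided headers in my manual request correspond to is not shown above.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.cloudflare.com/client/v4"


def cloudflare_auth_headers(
    email: Optional[str] = None,
    api_key: Optional[str] = None,
    api_token: Optional[str] = None,
) -> dict:
    """Return the auth headers for one of Cloudflare's two schemes.

    Hypothetical helper: a scoped API token goes in an Authorization
    header, while a Global API key is sent together with the account email.
    """
    if api_token is not None:
        return {"Authorization": f"Bearer {api_token}"}
    if email is not None and api_key is not None:
        return {"X-Auth-Email": email, "X-Auth-Key": api_key}
    raise ValueError("need either api_token, or both email and api_key")


def build_zone_lookup(zone: str, headers: dict) -> Request:
    """Build (but do not send) a 'list zones filtered by name' request."""
    query = urlencode({"name": zone})
    return Request(f"{API_BASE}/zones?{query}", headers=headers)


# To actually send it: urllib.request.urlopen(build_zone_lookup(zone, headers))
```

As far as I can tell from lego's Cloudflare provider documentation, `CLOUDFLARE_DNS_API_TOKEN` is meant for a scoped token sent as a Bearer token, while a Global API key is expected in `CLOUDFLARE_API_KEY` alongside `CLOUDFLARE_EMAIL`. Treat that mapping as an assumption to verify, not something I have confirmed.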
At first glance it looks related to the Cloudflare token, but that shouldn't be the case, as I'm using the Global API key (I know that's not recommended). Digging further, I found this thread, which mentions that lego needs to be able to perform an SOA DNS query on the domain, so I opened a shell in the container and ran the query with nslookup:
/ # nslookup -type=soa adguard.loki.[REDACTED DOMAIN]
Server: 127.0.0.11
Address: 127.0.0.11:53
Non-authoritative answer:
I noticed the image is Alpine-based, so I also tried a plain Alpine container, with the same result. Yet the same command works in a vanilla Ubuntu-based container and on the host system, where it returns the following:
root@a0d6a19ea344:/# nslookup -type=soa adguard.loki.[REDACTED DOMAIN]
Server: 127.0.0.11
Address: 127.0.0.11#53
Non-authoritative answer:
*** Can't find adguard.loki.[REDACTED DOMAIN]: No answer
Authoritative answers can be found from:
[REDACTED DOMAIN]
	origin = nena.ns.cloudflare.com
	mail addr = dns.cloudflare.com
	serial = 2289543556
	refresh = 10000
	retry = 2400
	expire = 604800
	minimum = 3600
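To take the differences between nslookup builds out of the picture, the same SOA question could be sent to the resolver directly and the raw reply inspected. This is a rough Python sketch using only the standard library; the function names are made up, `127.0.0.11` is simply the resolver address shown in the output above, and only the reply header is parsed. It could be that both containers receive the same reply and the two nslookup versions just print different amounts of it; counting the records in the answer and authority sections would tell.

```python
import socket
import struct

SOA = 6  # DNS record type number for SOA (RFC 1035)
IN = 1   # DNS class IN


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.strip(".").split("."):
        if not label:
            continue  # skip empty labels (e.g. the root name)
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_soa_query(name: str, txid: int = 0x1234) -> bytes:
    """Build a single-question SOA query with the 'recursion desired' flag."""
    # Header fields: id, flags (0x0100 = RD), QDCOUNT=1, ANCOUNT, NSCOUNT, ARCOUNT.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    return header + encode_name(name) + struct.pack("!HH", SOA, IN)


def parse_header(reply: bytes) -> dict:
    """Extract the response code and record counts from a DNS reply header."""
    txid, flags, _qd, an, ns, _ar = struct.unpack("!HHHHHH", reply[:12])
    return {
        "id": txid,
        "rcode": flags & 0x000F,            # 0 = NOERROR, 3 = NXDOMAIN
        "truncated": bool(flags & 0x0200),  # TC bit
        "answers": an,
        "authority": ns,
    }


def query_soa(name: str, server: str = "127.0.0.11", port: int = 53,
              timeout: float = 3.0) -> dict:
    """Send the SOA query over UDP and return the parsed reply header."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_soa_query(name), (server, port))
        reply, _addr = sock.recvfrom(4096)
    return parse_header(reply)
```

Running `query_soa("adguard.loki.<domain>")` inside each container should show whether the reply itself differs. A result with zero answers but one authority record would match what the Ubuntu nslookup prints above.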
Traefik docker-compose.yml:
version: '3.3'
networks:
  proxied_services:
    external: true
services:
  traefik:
    image: "traefik:v2.9"
    container_name: "traefik"
    #dns:
    #  - 1.1.1.1 #Tried both with and without this
    networks:
      - proxied_services
    volumes:
      - /home/cubi/docker/network_infra/traefik:/etc/traefik
      - /home/cubi/docker/network_infra/traefik/acme:/letsencrypt
      - /home/cubi/docker/network_infra/traefik/traefik.yml:/etc/traefik/traefik.yml:ro
      - /home/cubi/docker/network_infra/traefik/dynamic_config.yml:/etc/traefik/dynamic_config.yml:ro
      #- /var/run/docker.sock:/var/run/docker.sock:ro
      #- /etc/localetime:/etc/localtime:ro
    ports:
      - 80:80
      - 8080:8080
      - 443:443
    environment:
      - TZ=[REDACTED]
      - CLOUDFLARE_EMAIL=[REDACTED]
      - CLOUDFLARE_DNS_API_TOKEN=[REDACTED]
Commands used to start the Alpine and Ubuntu test containers on the same network:
docker run -dit --name ubuntu1 --network proxied_services ubuntu bash
docker run -dit --name alpine1 --network proxied_services alpine ash
I'm no DNS expert, so this might make no sense, but it seems that neither the Alpine container nor Traefik follows up on what the DNS server returns. I will compare the queries and responses in Wireshark tomorrow.
So my questions are:
- How can I be sure the SOA DNS lookup is the actual root cause?
- Is it expected for a plain Alpine-based container to fail that query?
- On the other hand, I'm not sure we can equate nslookup failing a query with the system resolver failing it. I guess nslookup bypasses the default resolver entirely?
Thank you!