Hi! First-time poster here. I apologize in advance if this has been asked before (I've checked, but not all the posts) or if this is the wrong place; feel free to point me elsewhere if so.
I recently upgraded my Kubernetes cluster from Traefik 1.7 to 2+. I installed Traefik 2+ using the containous Helm chart with the default values (https://docs.traefik.io/getting-started/install-traefik/#use-the-helm-chart). From my understanding, this installation method creates CRDs for k8s that better map to Traefik concepts. Fine with me: I migrated my routes to the new IngressRoute definitions and they worked; I was able to reach my services just fine through Traefik.
Now I've been trying to set up HTTPS, but for some reason Traefik always fails the TLS and HTTP challenges (DNS is not an option right now, and I don't need it). I know traffic gets to the destination service because I can curl / visit the HTTPS route by trusting the dummy cert that Traefik generates, but for some reason Let's Encrypt can't verify?
Here's the ingress route def:
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-v2-dashboard
  namespace: kube-system
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`traefik.mydomain.com`)
      kind: Rule
      services:
        - kind: TraefikService
          name: api@internal
  tls:
    certResolver: myresolver
---
and the traefik deployment:
spec:
  containers:
    - args:
        - --global.checknewversion
        - --global.sendanonymoususage
        - --entryPoints.traefik.address=:9000
        - --entryPoints.web.address=:80
        - --entryPoints.websecure.address=:443
        - --api.dashboard=true
        - --ping=true
        - --providers.kubernetescrd
        - --email@example.com
        - --certificatesresolvers.myresolver.acme.storage=acme.json
        - --certificatesresolvers.myresolver.acme.tlschallenge
        - --certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
      image: traefik:2.2.0
      name: traefik-v2
      ports:
        - containerPort: 9000
          name: traefik
          protocol: TCP
        - containerPort: 80
          name: web
          protocol: TCP
        - containerPort: 443
          name: websecure
          protocol: TCP
and the errors I've been getting for the HTTP challenge (the TLS ones are similar):
time="2020-04-20T14:15:04Z" level=error msg="Unable to obtain ACME certificate for domains \"traefik.mydomain.com\": unable to generate a certificate for the domains [traefik.mydomain.com]: acme: Error -> One or more domains had a problem:\n[traefik.mydomain.com] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Fetching http://traefik.mydomain.com/.well-known/acme-challenge/mjJQBITgUNfDxRR5am798R-p3DhqdTVJ97VWjGlHzUM: Connection refused, url: \n" rule="Host(`traefik.mydomain.com`)" providerName=myresolver.acme routerName=kube-system-traefik-v2-dashboard-541f1f6c7cdf8c56d30d@kubernetescrd
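Note that the deployment above shows the TLS challenge args, while this error comes from the HTTP challenge. For reference, a minimal sketch of what the HTTP-challenge variant of the resolver args looks like, assuming the challenge is answered on the `web` entry point (:80) defined above; this is an illustration, not a copy of the exact config that produced the log line:

```yaml
# Sketch only: resolver args for the HTTP-01 challenge instead of TLS-ALPN-01.
# Assumption: the `web` entry point (:80) is the one reachable from the internet,
# since Let's Encrypt fetches /.well-known/acme-challenge/... over plain HTTP.
args:
  - --entryPoints.web.address=:80
  - --certificatesresolvers.myresolver.acme.storage=acme.json
  - --certificatesresolvers.myresolver.acme.httpchallenge.entrypoint=web
  - --certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
```

With this variant, the "Connection refused" in the log means the validation server could not open a connection to port 80 at the address the domain resolves to.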
For context, I got the domain from Google Domains and have the DNS managed by Netlify, where I forward the relevant subdomains to the GKE cluster. It's a roundabout configuration, but it works. Also, I'm running in a non-HA configuration: one Traefik container on one node receives the requests and dispatches them.
Does anyone know what my problem may be, or how I can get more visibility into it?
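On the visibility side, one thing that can be tried is raising Traefik's log level and enabling the access log. A minimal sketch of the extra args, assuming they are appended to the container args shown earlier (both flags are standard Traefik v2 static options, but whether they surface the root cause here is untested):

```yaml
# Sketch only: extra container args for more verbose diagnostics.
args:
  # Emit debug-level logs, including the ACME provider's challenge steps.
  - --log.level=DEBUG
  # Log every incoming request, to see whether the validation requests arrive at all.
  - --accesslog=true
```

If no request for /.well-known/acme-challenge/ ever shows up in the access log, the validation traffic is likely not reaching the Traefik pod in the first place.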