Hi! First-time poster here. I apologize in advance if this has been asked before (I've searched, but not through every post). If this is the wrong place, feel free to point me elsewhere.
I recently upgraded my Kubernetes cluster from Traefik 1.7 to 2+. I installed Traefik 2+ using the Containous Helm chart with the default values (https://docs.traefik.io/getting-started/install-traefik/#use-the-helm-chart). From my understanding, this installation method creates CRDs for k8s that map better to Traefik concepts. Fine with me: I migrated my routes to the new IngressRoute definitions and they worked, and I was able to reach my services just fine through Traefik.
Now I've been trying to set up HTTPS, but for some reason Traefik always fails the TLS and HTTP challenges (DNS is not an option right now, and I don't need it). I know traffic gets to the destination service because I can curl or visit the HTTPS route by trusting the dummy cert that Traefik generates, but for some reason Let's Encrypt can't verify the domain.
Here's the ingress route def:
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-v2-dashboard
  namespace: kube-system
spec:
  entryPoints:
  - websecure
  routes:
  - match: Host(`traefik.mydomain.com`)
    kind: Rule
    services:
    - kind: TraefikService
      name: api@internal
  tls:
    certResolver: myresolver
---
and the traefik deployment:
spec:
  containers:
  - args:
    - --global.checknewversion
    - --global.sendanonymoususage
    - --entryPoints.traefik.address=:9000
    - --entryPoints.web.address=:80
    - --entryPoints.websecure.address=:443
    - --api.dashboard=true
    - --ping=true
    - --providers.kubernetescrd
    - --certificatesresolvers.myresolver.acme.email=engineering@mydomain.com
    - --certificatesresolvers.myresolver.acme.storage=acme.json
    - --certificatesresolvers.myresolver.acme.tlschallenge
    - --certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
    image: traefik:2.2.0
    name: traefik-v2
    ports:
    - containerPort: 9000
      name: traefik
      protocol: TCP
    - containerPort: 80
      name: web
      protocol: TCP
    - containerPort: 443
      name: websecure
      protocol: TCP
and the errors I've been getting for the HTTP challenge (the TLS challenge errors are similar):
time="2020-04-20T14:15:04Z" level=error msg="Unable to obtain ACME certificate for domains \"traefik.mydomain.com\": unable to generate a certificate for the domains [traefik.mydomain.com]: acme: Error -> One or more domains had a problem:\n[traefik.mydomain.com] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Fetching http://traefik.mydomain.com/.well-known/acme-challenge/mjJQBITgUNfDxRR5am798R-p3DhqdTVJ97VWjGlHzUM: Connection refused, url: \n" rule="Host(`traefik.mydomain.com`)" providerName=myresolver.acme routerName=kube-system-traefik-v2-dashboard-541f1f6c7cdf8c56d30d@kubernetescrd
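The log above is from the HTTP challenge, while the deployment arguments shown earlier use `tlschallenge`. For reference, here is a minimal sketch of what the HTTP challenge variant of the resolver arguments would look like. It assumes the `web` entry point (port 80) is the one reachable from the internet, which is an assumption and not something confirmed for this cluster.

```yaml
# Sketch only: resolver arguments for the HTTP-01 challenge instead of TLS-ALPN-01.
# Assumption: the "web" entry point (port 80) is exposed publicly by the Service / load balancer.
- --entryPoints.web.address=:80
- --certificatesresolvers.myresolver.acme.email=engineering@mydomain.com
- --certificatesresolvers.myresolver.acme.storage=acme.json
- --certificatesresolvers.myresolver.acme.httpchallenge.entrypoint=web
- --certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
```

Let's Encrypt fetches the challenge token over plain HTTP on port 80 from outside the cluster, so a `Connection refused` on that URL suggests that nothing accepted the connection on port 80 at the address the domain resolves to.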
For context, I got the domain from Google Domains and have the DNS managed by Netlify, where I forward the relevant subdomains to the GKE cluster. It's a roundabout configuration, but it works. Also, I'm running in a non-HA configuration: a single Traefik container on one node receives the requests and dispatches them.
Does anyone know what my problem might be, or how I can get more visibility into it?
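On the visibility side, one option that might help is raising Traefik's log level and turning on access logs. A minimal sketch of the extra container arguments, assuming they are appended to the `args` list of the deployment above:

```yaml
# Sketch only: extra container args for more detailed Traefik output.
- --log.level=DEBUG   # log ACME challenge steps in detail
- --accesslog=true    # log each incoming request, including challenge requests
```

With access logs enabled, a validation request from Let's Encrypt that never shows up in the log would point at the request not reaching Traefik at all.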