I am trying to get automated Let's Encrypt certificate generation up and running using the DNS challenge (dnsChallenge) with the route53 provider. I have invested the last 48 hours in this and I am fairly confident I am using it the right way, i.e. the way it is described in the docs. I wanted to discuss this here before opening a bug report on GitHub, in case it turns out to be one. What makes me pull my hair out is not a pile of error messages (those would at least help me help myself); it is that specifying the dnsChallenge seems to be ignored entirely and has no effect at all.
First of all, here is my setup:
Deployment
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "10"
  creationTimestamp: "2019-12-13T12:28:26Z"
  generation: 10
  labels:
    app: operator
    chart: operator-0.2.0
    heritage: Helm
    release: traefik
  name: traefik-operator
  namespace: kube-system
  resourceVersion: "18467616"
  selfLink: /apis/extensions/v1beta1/namespaces/kube-system/deployments/traefik-operator
  uid: 76309811-d2b6-4e97-b3a5-af00dfe82916
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: operator
      release: traefik
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: operator
        release: traefik
    spec:
      containers:
      - args:
        - --api=true
        - --api.dashboard=true
        - --api.debug=true
        - --api.insecure=true
        - --accesslog=false
        - --entrypoints.web=true
        - --entrypoints.web.address=:8000
        - --entrypoints.web.forwardedheaders.insecure=true
        - --entrypoints.websecure=true
        - --entrypoints.websecure.address=:4443
        - --entrypoints.websecure.forwardedheaders.insecure=true
        - --entrypoints.traefik=true
        - --entrypoints.traefik.address=:8080
        - --entrypoints.traefik.forwardedheaders.insecure=true
        - --providers.kubernetescrd=true
        - --certificatesresolvers.letsencrypt=true
        - --certificatesresolvers.letsencrypt.acme.email=my.name@company.ch
        - --certificatesresolvers.letsencrypt.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
        - --certificatesresolvers.letsencrypt.acme.dnschallenge=true
        - --certificatesresolvers.letsencrypt.acme.dnschallenge.provider=route53
        - --global.checknewversion=true
        - --log=true
        - --log.format=common
        - --log.level=INFO
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: POD_SERVICE_ACCOUNT
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.serviceAccountName
        envFrom:
        - secretRef:
            name: traefik-operator-secret
            # AWS_ACCESS_KEY_ID
            # AWS_SECRET_ACCESS_KEY
            # AWS_REGION
        image: traefik:2.1.1
        imagePullPolicy: IfNotPresent
        name: traefik
        ports:
        - containerPort: 8000
          name: web
          protocol: TCP
        - containerPort: 4443
          name: websecure
          protocol: TCP
        - containerPort: 8080
          name: admin
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/run/docker.sock
          name: docker-socket
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: traefik-ingress-controller
      serviceAccountName: traefik-ingress-controller
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: traefik-operator-config
        name: traefik-config
      - hostPath:
          path: /var/run/docker.sock
          type: Socket
        name: docker-socket
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2019-12-13T12:28:28Z"
    lastUpdateTime: "2019-12-13T12:28:28Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2019-12-13T12:28:26Z"
    lastUpdateTime: "2019-12-13T18:00:59Z"
    message: ReplicaSet "traefik-operator-98cfbb7d5" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 10
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
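For easier reading, the ACME-related flags from the Deployment correspond roughly to the following static configuration in file form. This is only a sketch, not something I have deployed: I am assuming the usual mapping from CLI flags to the YAML file format, and the `storage` value is simply the default that Traefik reports in its startup log. The route53 credentials are not part of this file; they come from the environment variables in `traefik-operator-secret`.

```yaml
# Sketch: file-form equivalent of the --certificatesresolvers.letsencrypt.* flags.
# Assumes the standard flag-to-file mapping; not taken from a running config.
certificatesResolvers:
  letsencrypt:
    acme:
      email: my.name@company.ch
      # Let's Encrypt staging endpoint, as in the flags
      caServer: https://acme-staging-v02.api.letsencrypt.org/directory
      # Default value; not set explicitly in my flags
      storage: acme.json
      dnsChallenge:
        provider: route53
```

Since no HTTP or TLS challenge is configured here, the DNS challenge is the only way this resolver could obtain a certificate.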
Service
Furthermore, I have created this LoadBalancer service to expose Traefik to the outside world.
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2019-12-12T20:38:01Z"
  labels:
    app: operator
    chart: traefik-1.0.0
    heritage: Helm
    release: traefik
  name: traefik-operator-elb
  namespace: kube-system
  resourceVersion: "18207922"
  selfLink: /api/v1/namespaces/kube-system/services/traefik-operator-elb
  uid: b68bf118-3776-405f-aff9-c1f973f82e73
spec:
  clusterIP: 100.70.65.64
  externalTrafficPolicy: Cluster
  ports:
  - name: web
    nodePort: 32506
    port: 80
    protocol: TCP
    targetPort: 8000
  - name: websecure
    nodePort: 32541
    port: 443
    protocol: TCP
    targetPort: 4443
  selector:
    app: operator
    release: traefik
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - hostname: ab68bf1113736405faff9c1f973f82t7-1775464521.eu-central-1.elb.amazonaws.com
IngressRoute
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  creationTimestamp: "2019-12-13T15:41:18Z"
  generation: 3
  labels:
    app: operator
    chart: traefik-1.0.0
    heritage: Helm
    release: traefik
  name: traefik-operator-dashboard
  namespace: default
  resourceVersion: "18467054"
  selfLink: /apis/traefik.containo.us/v1alpha1/namespaces/default/ingressroutes/traefik-operator-dashboard
  uid: cd4cf195-c7c9-4884-86af-7bbe2b3a3123
spec:
  entryPoints:
  - websecure
  routes:
  - kind: Rule
    match: Host(`company.ch`)
    services:
    - kind: TraefikService
      name: api@internal
  tls:
    certResolver: letsencrypt
Logs
Ok, now let's have a look at the logs:
time="2019-12-13T18:00:58Z" level=info msg="Configuration loaded from flags."
time="2019-12-13T18:00:58Z" level=info msg="Traefik version 2.1.1 built on 2019-12-12T19:01:37Z"
time="2019-12-13T18:00:58Z" level=info msg="\nStats collection is disabled.\nHelp us improve Traefik by turning this feature on :)\nMore details on: https://docs.traefik.io/v2.0/contributing/data-collection/\n"
time="2019-12-13T18:00:58Z" level=info msg="Starting provider aggregator.ProviderAggregator {}"
time="2019-12-13T18:00:58Z" level=info msg="Starting provider *acme.Provider {\"email\":\"my.name@company.ch\",\"caServer\":\"https://acme-staging-v02.api.letsencrypt.org/directory\",\"storage\":\"acme.json\",\"keyType\":\"RSA4096\",\"dnsChallenge\":{\"provider\":\"route53\"},\"ResolverName\":\"letsencrypt\",\"store\":{},\"ChallengeStore\":{}}"
time="2019-12-13T18:00:58Z" level=info msg="Testing certificate renew..." providerName=letsencrypt.acme
time="2019-12-13T18:00:58Z" level=info msg="Starting provider *crd.Provider {}"
time="2019-12-13T18:00:58Z" level=info msg="label selector is: \"\"" providerName=kubernetescrd
time="2019-12-13T18:00:58Z" level=info msg="Creating in-cluster Provider client" providerName=kubernetescrd
time="2019-12-13T18:00:58Z" level=info msg="Starting provider *traefik.Provider {}"
I turned on all the log and debug flags I could find, but I get no additional information and no mention of certificate generation. Elsewhere I have seen lego logs printed to stdout as well, but that is not the case here, so I wonder why the lego logs no longer appear.
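One thing I notice when re-reading the args: the log level in the Deployment is still INFO, so anything Traefik emits at debug level is filtered out. A minimal sketch of the same three log flags with the level raised is below. Whether DEBUG actually surfaces ACME or lego output in this setup is an assumption I have not verified.

```yaml
# Sketch only: the log flags from the Deployment args, with the level raised.
# Unverified whether this makes ACME/lego messages appear.
- --log=true
- --log.format=common
- --log.level=DEBUG
```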
Any help or pointers would be much appreciated.
Thank you