Traefik v2 + Helm v3 + IngressRoute: "Error getting validation data" (Let's Encrypt)

Hello everyone, my Traefik ingress can't get a TLS certificate from Let's Encrypt.

I used helm v3 to deploy to k8s:

helm install traefik traefik/traefik
helm upgrade traefik traefik/traefik --values traefik-custom-values.yml

with these user options (values.yml):

#traefik-custom-values.yml
additionalArguments:
  - --certificatesresolvers.le.acme.email=me@fqdn.xx
  - --certificatesresolvers.le.acme.storage=/data/acme.json
  - --certificatesresolvers.le.acme.tlschallenge=true
  - --entrypoints.web.http.redirections.entryPoint.to=:443
  - --entrypoints.web.http.redirections.entryPoint.scheme=https
persistence:
  enabled: true
  path: /data

And the whoami.yaml manifest is:

# whoami.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: app-v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app-v1
  template:
    metadata:
      labels:
        app: app-v1
    spec:
      containers:
        - name: app-v1
          image: containous/whoami:v1.5.0

---
apiVersion: v1
kind: Service
metadata:
  name: app-v1
  labels:
    app: app-v1
spec:
  type: ClusterIP
  ports:
    - port: 80
      name: app-v1
  selector:
    app: app-v1

---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: app
  annotations:
    kubernetes.io/ingress.class: traefik
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: app-whoami.fqdn.xx
      http:
        paths:
          - backend:
              serviceName: app-v1
              servicePort: 80

---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: app-tls
spec:
  entryPoints:
    - websecure
  routes:
    - kind: Rule
      match: Host(`app-whoami.fqdn.xx`)
      services:
        - name: app-v1
          port: 80
  tls: # This route uses TLS
    certResolver: le # Uses our certificate resolver to get a certificate automatically!

But kubectl logs traefik-xxx shows:

time="2020-05-28T08:16:02Z" level=info msg="Configuration loaded from flags."
time="2020-05-28T08:16:15Z" level=error msg="Unable to obtain ACME certificate for domains \"app-whoami.fqdn.xx\": unable to generate a certificate for the domains [app-whoami.fqdn.xx]: error: one or more domains had a problem:\n[app-whoami.fqdn.xx] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Error getting validation data, url: \n" providerName=le.acme routerName=default-app-tls-dc08f57e4bf03442f760@kubernetescrd rule="Host(`app-whoami.fqdn.xx`)"

Any idea why Traefik can't get the challenge right?

Thanks a lot for your help!

Ok, enabling log.level=DEBUG shows this error message:

time="2020-05-28T09:46:20Z" level=error msg="The ACME resolver \"le\" is skipped from the resolvers list because: unable to get ACME account: permissions 660 for /data/acme.json are too open, please use 600"
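
In case it helps anyone searching for the same thing: this line is easy to miss in DEBUG output, so filtering for error lines that mention the ACME resolver is handy. A minimal sketch, using the two log lines quoted in this thread as sample input; on a real cluster the input would come from kubectl logs on the Traefik pod (the pod name depends on your release, so it is left as a placeholder).

```shell
# Sample input: log lines quoted in this thread, standing in for the real
# output of `kubectl logs <traefik-pod>` (pod name is a placeholder).
sample_log=$(cat <<'EOF'
time="2020-05-28T08:16:02Z" level=info msg="Configuration loaded from flags."
time="2020-05-28T09:46:20Z" level=error msg="The ACME resolver \"le\" is skipped from the resolvers list because: unable to get ACME account: permissions 660 for /data/acme.json are too open, please use 600"
EOF
)

# Keep only error-level lines that mention the ACME resolver.
acme_errors=$(printf '%s\n' "$sample_log" | grep 'level=error' | grep -i 'acme resolver')
printf '%s\n' "$acme_errors"
```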

This is discussed in this GitHub issue for the Helm chart: https://github.com/containous/traefik-helm-chart/issues/164

I ended up running chmod g-rw /data/acme.json inside the container, and then adding this to values.yml, as suggested in the GitHub issue:

podSecurityContext:
  fsGroup: null

Then I ran helm upgrade again. That fixed it.
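
For anyone unsure what chmod g-rw does here: it removes the group read and write bits, which takes the file from 660 to the 600 that Traefik asks for. A small local sketch on a throwaway temp file (not the real acme.json); in the cluster, the same chmod would be run inside the Traefik container, for example through kubectl exec.

```shell
# Throwaway file standing in for /data/acme.json.
f=$(mktemp)
chmod 660 "$f"                       # the mode Traefik complained about
before=$(ls -l "$f" | cut -c1-10)    # -rw-rw----

chmod g-rw "$f"                      # drop group read/write, as in the post
after=$(ls -l "$f" | cut -c1-10)     # -rw-------

echo "before: $before"
echo "after:  $after"
rm -f "$f"
```

As far as I understand it, the fsGroup: null part matters because, with fsGroup set, Kubernetes can make the files on the volume group-accessible again when the pod is recreated, which would bring the 660 mode back.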