How to debug the cert being served in k8s

I have an IngressRoute that references a cert I updated because the previous one had expired. I've restarted Traefik multiple times. It serves the correct cert for a while, but then randomly reverts to serving the old expired one, and sometimes a restart brings the new one back. This is a manually provided cert, not Let's Encrypt.

I see the following line once in the logs, but it never repeats, so I'm not sure whether it's just a temporary error while the pod loads.

time="2023-10-03T19:13:21Z" level=error msg="Error configuring TLS: secret application/wildcard-tls does not exist" namespace=application providerName=kubernetescrd ingress=applicant

This secret DOES exist, however! The Traefik Service/ClusterRole/ClusterRoleBinding also all appear to be correct as far as I can see.

Can anyone offer any advice? This is still an issue for us :frowning:

Hi @bnason, thanks for your interest in Traefik!

Could you please share your configuration?

Thanks in advance!

log:
  level: DEBUG
accessLog:
  format: clf
entryPoints:
  traefik:
    address: ":7000"
  ping:
    address: ":7001"
  http-internet:
    address: ":8000"
  http-extranet:
    address: ":8001"
  http-intranet:
    address: ":8002"
  https-internet:
    address: ":9000"
  https-extranet:
    address: ":9001"
  https-intranet:
    address: ":9002"
ping:
  entryPoint: ping
providers:
  kubernetescrd: {}
  kubernetesIngress: {}
tls:
  options:
    default:
      sniStrict: true
      minVersion: VersionTLS13
      cipherSuites:
        - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
        - TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
        - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
        - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
        - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
api:
  dashboard: true
  insecure: true
metrics:
  prometheus:
    addEntryPointsLabels: true
    addServicesLabels: true
    entryPoint: traefik

If you are running k8s I would expect a lot more configs :rofl:

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: project
  namespace: project
spec:
  entryPoints:
    - https-internet
    - https-extranet
    - https-intranet
  routes:
    - kind: Rule
      match: Host(`sub.example.com`)
      services:
        - name: project
          port: 80
  tls:
    secretName: wildcard-tls

apiVersion: v1
kind: Service
metadata:
  name: project
  namespace: project
  labels:
    app: project
spec:
  type: ClusterIP
  selector:
    app: project
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80

apiVersion: apps/v1
kind: Deployment
metadata:
  name: project
  namespace: project
  labels:
    app: project
spec:
  selector:
    matchLabels:
      app: project
  template:
    metadata:
      labels:
        app: project
    spec:
      containers:
        - name: project
          image: registry.example.com/project:v1.1.3
      restartPolicy: Always

> kubectl --namespace project describe service project
Name:              project
Namespace:         project
Labels:            app=project
Annotations:       <none>
Selector:          app=project
Type:              ClusterIP
IP Families:       <none>
IP:                10.43.105.147
IPs:               <none>
Port:              http  80/TCP
TargetPort:        80/TCP
Endpoints:         10.42.4.236:80
Session Affinity:  None
Events:            <none>

> kubectl --namespace project get secret wildcard-tls -o json | jq -r '.data["tls.crt"]' | base64 --decode | openssl x509 -enddate -noout
notAfter=Feb 24 20:37:49 2024 GMT
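
The command above only shows what is stored in the secret. To see what Traefik actually hands out, the served certificate can be pulled with `openssl s_client` and summarised the same way. This is a minimal sketch: `cert_summary` and `served_cert` are helper names made up here, and the external port 443 in the usage comment is an assumption (the entry points in the config listen on 9000-9002 inside the pod).

```shell
# Print the expiry date and SHA-256 fingerprint of a PEM certificate read from stdin.
cert_summary() {
  openssl x509 -noout -enddate -fingerprint -sha256
}

# Fetch the leaf certificate a server presents for a given SNI name, as PEM.
# Usage: served_cert HOST PORT [SNI]   (SNI defaults to HOST)
served_cert() {
  openssl s_client -connect "$1:$2" -servername "${3:-$1}" </dev/null 2>/dev/null \
    | openssl x509
}

# Example (hypothetical host and port):
#   served_cert sub.example.com 443 | cert_summary
#   kubectl --namespace project get secret wildcard-tls -o json \
#     | jq -r '.data["tls.crt"]' | base64 --decode | cert_summary
```

If the two fingerprints differ, Traefik is serving something other than that secret. Repeating the first command a few times shows whether the served cert flips between old and new.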

Anything else I need to provide?

Hi @bluepuma77

Your log says namespace=application, but the other files use namespace=project?

Did you see "How Traefik is Storing and Serving TLS Certificates"?

Did you confirm that you have only the latest cert in your K8s secrets and that you do not reference the old cert somewhere?

Sorry, that was an error from obfuscating the text at different times; both namespaces are the same.

Does this mean that Traefik can potentially use a cert OTHER than the one directly referenced in the IngressRoute?

The cert referenced in my IngressRoute is definitely the latest one. I believe I've updated the cert everywhere else in my cluster, but it's a bit disconcerting if it's using a cert other than the one I explicitly told it to use.

For each incoming connection, Traefik serves the "best" matching TLS certificate for the provided server name.

If Traefik still finds the old cert, it will prefer the old cert over the new one.
The best way to make sure that Traefik is using the new cert is to remove the old cert completely from the cluster.
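
One way to hunt for leftover copies is to list every secret that carries a `tls.crt`, together with its expiry date. This is a rough sketch: `list_tls_secrets` is a made-up helper name, and it assumes your kubectl context may read secrets in all namespaces. It lists more than Traefik necessarily loads, since only secrets referenced by an IngressRoute, Ingress or similar resource are candidates, but an expired entry is a good place to start looking.

```shell
# List every secret in the cluster that has a tls.crt key, with the leaf cert's expiry.
list_tls_secrets() {
  kubectl get secrets --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.data.tls\.crt}{"\n"}{end}' \
    | while IFS="$(printf '\t')" read -r ns name crt; do
        # Secrets without a tls.crt produce an empty third field; skip them.
        [ -n "$crt" ] || continue
        end=$(printf '%s' "$crt" | base64 --decode | openssl x509 -noout -enddate 2>/dev/null)
        printf '%s/%s\t%s\n' "$ns" "$name" "${end:-unparseable}"
      done
}

# Example:
#   list_tls_secrets
```

Any line whose `notAfter` date is in the past points at a secret still holding the old cert. Updating or deleting it, and then checking which routes reference it, follows the advice above.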