Hello,
I am running a k3s cluster with Traefik as the ingress controller. This morning I was notified that all my services were down and returning 404 errors. I checked the Traefik logs and saw this:
<serving traffic as usual>
{"level":"info","msg":"Traefik version 2.9.4 built on 2022-10-27T18:44:34Z","time":"2023-02-28T06:59:36Z"}
{"level":"info","msg":"\nStats collection is disabled.\nHelp us improve Traefik by turning this feature on :)\nMore details on: https://doc.traefik.io/traefik/contributing/data-collection/\n","time":"2023-02-28T06:59:36Z"}
{"error":"failed to download plugin github.com/soulbalz/traefik-real-ip: failed to call service: Get \"https://plugins.traefik.io/public/download/github.com/soulbalz/traefik-real-ip/v1.0.3\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)","level":"error","msg":"Plugins are disabled because an error has occurred.","time":"2023-02-28T06:59:41Z"}
{"entryPointName":"websecure","level":"error","msg":"invalid middleware \"kube-system-traefik-real-ip@kubernetescrd\" configuration: invalid middleware type or middleware does not exist","routerName":"service-name-ce29b9a8df15f4c7cd88@kubernetescrd","time":"2023-02-28T06:59:42Z"}
I was able to fix this by simply killing the pod, but in the meantime all services were down, because every one of them uses one plugin or another.
I have a couple of questions regarding this issue:
- Why was the config suddenly reloaded? The pod had been up for 28 days and this had never happened before. There is nothing in the logs to suggest a problem, and the pod did not die.
- Is there a way to make the health check fail when a plugin download fails, so that Kubernetes keeps restarting the pod until the download succeeds?
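To illustrate the second question, here is a rough sketch of the kind of check I have in mind. It is not something Traefik provides. It assumes the Traefik API is reachable at `http://localhost:9000/api/http/routers` (in my setup that may need extra configuration), and that each router entry in the response carries `name` and `status` fields, with `status` set to `enabled` when the router is healthy. The names `ROUTERS_URL`, `broken_routers` and `main` are made up for this example.

```python
import json
import sys
import urllib.request

# Assumption: the Traefik API is exposed on the "traefik" entrypoint (:9000)
# and reachable from wherever this check runs.
ROUTERS_URL = "http://localhost:9000/api/http/routers"


def broken_routers(routers):
    """Return the names of routers whose status is not "enabled".

    Assumption: each router object has "name" and "status" keys.
    """
    return [
        r.get("name", "<unnamed>")
        for r in routers
        if r.get("status") != "enabled"
    ]


def main():
    """Return 0 if all routers look healthy, 1 otherwise."""
    try:
        with urllib.request.urlopen(ROUTERS_URL, timeout=3) as resp:
            routers = json.load(resp)
    except (OSError, ValueError) as exc:
        # Network errors and invalid JSON both count as a failed check.
        print(f"cannot query Traefik API: {exc}", file=sys.stderr)
        return 1

    broken = broken_routers(routers)
    if broken:
        print("routers with errors: " + ", ".join(broken), file=sys.stderr)
        return 1
    return 0

# As a probe command, the script would end with: sys.exit(main())
```

As far as I can tell the Traefik image does not ship Python, so something like this would have to run from a sidecar container or an external monitor rather than as an exec probe in the Traefik container itself. A built-in option would be much nicer.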
My config:
- --entrypoints.metrics.address=:9101/tcp
- --entrypoints.traefik.address=:9000/tcp
- --entrypoints.web.address=:8000/tcp
- --entrypoints.websecure.address=:8443/tcp
- --api.dashboard=true
- --ping=true
- --metrics.prometheus=true
- --metrics.prometheus.entrypoint=metrics
- --providers.kubernetescrd
- --providers.kubernetesingress
- --providers.kubernetesingress.ingressendpoint.publishedservice=kube-system/traefik
- --entrypoints.web.http.redirections.entryPoint.to=:443
- --entrypoints.web.http.redirections.entryPoint.scheme=https
- --entrypoints.websecure.http.tls=true
- --log.format=json
- --log.level=INFO
- --accesslog=true
- --accesslog.format=json
- --accesslog.fields.defaultmode=drop
- --accesslog.fields.names.ClientHost=keep
- --accesslog.fields.names.RequestHost=keep
- --accesslog.fields.names.RequestMethod=keep
- --accesslog.fields.names.RequestPath=keep
- --accesslog.fields.headers.defaultmode=drop
- --accesslog.fields.headers.names.Cf-Ipcountry=keep
- --accesslog.fields.headers.names.User-Agent=keep
- --accesslog.fields.headers.names.X-Forwarded-User=keep
- --log.level=INFO
- --providers.kubernetescrd.allowCrossNamespace=true
- --providers.kubernetescrd
- --experimental.plugins.traefik-real-ip.modulename=github.com/soulbalz/traefik-real-ip
- --experimental.plugins.traefik-real-ip.version=v1.0.3
- --entryPoints.web.forwardedHeaders.insecure
- --entryPoints.websecure.forwardedHeaders.insecure
image: rancher/mirrored-library-traefik:2.9.4
Thanks!