11K routers in K8s, high CPU load on pod / service churn, is this expected and am I doing it right?

Hey folks!

I'm using the latest Traefik v2 to provide ingress at ElfHosted - here is my helmrelease.

We have 20 nodes, and I'm running Traefik as a daemonset, so I have 20 Traefik pods. There are currently close to 4000 tenant pods in the cluster.
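
For context, here's a simplified sketch of the relevant bits of that helmrelease (trimmed right down, and the field names are from the official traefik chart, so treat it as an approximation of what I actually run):

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: traefik
  namespace: traefik
spec:
  chart:
    spec:
      chart: traefik
  values:
    # One Traefik pod per node (20 nodes -> 20 pods)
    deployment:
      kind: DaemonSet
    # Each pod watches the cluster's IngressRoute CRDs itself
    providers:
      kubernetesCRD:
        enabled: true
```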

Recently, when we shut down all 4000 pods for maintenance, Traefik's CPU load spiked significantly during both the scale-down and the subsequent scale-up.

AFAIK, each Traefik pod has no awareness of the others, so presumably all 20 of them are hitting the kube-apiserver at once, each rebuilding its dynamic configuration every time one of the IngressRoutes changes?
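
One thing I've been wondering about is whether the provider throttle option would help; if I understand the docs correctly, something like this in the chart values would limit how often each pod rebuilds its config during a burst of changes (the 10s value is just a guess on my part):

```yaml
# Hypothetical addition to the same helmrelease values: throttle how often the
# Kubernetes CRD provider reacts to a burst of IngressRoute changes.
additionalArguments:
  - "--providers.kubernetescrd.throttleDuration=10s"
```

Although I'm not sure whether that actually reduces the work or just delays it, hence the question.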

Is this expected behaviour (the CPU load), am I doing this right, or is there a better way? :slight_smile:

Thanks!
D