Last week we had to migrate about 1000 customers in our DB, which led to over 3000 new resources being created in Consul KV that were then picked up by Traefik.
This led to enormous resource spikes that eventually ended with Traefik being forcefully stopped by our EKS cluster.
Now we are back to normal, but I am still wondering what caused these spikes.
I went through thousands of log lines, but the best I can do is guess.
However, I want to understand it and prevent future events like this.
Can anyone point me in the right direction?
I searched GitHub for possibly related memory leaks or similar issues in combination with KV, but found nothing relevant.
Was it just too many updates at once for Traefik to handle? Or could something like an error with a router cause these spikes?
I ask because the way we fixed it was by rolling back our changes and removing all the newly created routes.
The environment is:
- EKS version 1.32
- Traefik 3.6.7
Thanks in advance for any hint!
Leo