We have Traefik configured in our Kubernetes clusters (about 20 nodes). When traffic increased to about 2K requests/minute, our Traefik daemon died. Here is the sequence of log events we observed:
- The kubelet received SyncLoop: DELETE for traefik and killed the running traefik container.
- The kubelet tried to start the same container, but the container was not found, and it gave up after 5 retries.
- The kubelet then tried to pull traefik image 1.17.4 (a future version number), which does not exist on Docker Hub. After a few retries, it gave up.
- At this point, all traefik pods (DaemonSet) had died, and we noticed that a new DaemonSet configuration had been created and the old one deleted, without anyone touching the system.
- After 15 minutes, we added a new node and traefik started on it. All existing nodes also received SyncLoop: ADD for traefik, then a new version of the traefik image, 1.7.14, was downloaded and traefik started on all nodes.
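
To make the "future version" point concrete, here is a small sketch comparing the two tags from the logs numerically. The helper name parse_tag is made up for illustration, and it assumes the tags are plain dotted integers:

```python
def parse_tag(tag):
    """Split a dotted tag such as '1.7.14' into a tuple of ints."""
    return tuple(int(part) for part in tag.split("."))

attempted = parse_tag("1.17.4")  # tag the node tried to pull (does not exist)
running = parse_tag("1.7.14")    # tag that was eventually downloaded

# Compared component by component, the attempted tag is the higher one.
is_future = attempted > running

# Both tags use the same digits, only grouped differently.
same_digits = sorted("1.17.4".replace(".", "")) == sorted("1.7.14".replace(".", ""))

print(is_future, same_digits)
```

So the tag that failed to pull is numerically ahead of the one that finally ran, even though the two look very similar at a glance.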
My questions are: why did traefik try to download 1.17.4, a version that has not been released, from the Docker repo? And why was the DaemonSet deleted and created again?
Thanks in advance,
-Sridharan