Service Health Checks

Hi,

I read that health checks are not available with KubernetesIngress and that a LivenessProbe should be used instead. However, I ran into the following issue and now have some doubts about relying on a LivenessProbe.

Node C on the cluster crashed, restarted and came back up almost properly. Pods, Services and cluster communication worked again, but due to networking configuration issues the pods on Node C were not reachable from other nodes. As a result, Traefik on Node A could not reach Node C, yet the pods on Node C reported everything as OK because the local liveness probe passed, so no alerts were generated.

After reading the documentation I figured out that traefik_service_server_up is not something I can use, so I decided to change the LivenessProbe to test the external Traefik endpoint instead.
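
Roughly, the new probe looks like this (hostname, path and port below are only placeholders, not my real values):

```yaml
# Sketch of the changed probe: the container only counts as "live" if a
# request through the external Traefik endpoint succeeds, so a failure
# anywhere along that path (DNS, load balancer, TLS, Traefik) restarts it.
livenessProbe:
  httpGet:
    host: app.example.com   # placeholder external hostname routed through Traefik
    path: /healthz          # placeholder health path
    port: 443
    scheme: HTTPS
  periodSeconds: 10
  failureThreshold: 3
```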

Now I worry that if for some reason the external probe fails while the pod itself is perfectly fine and the failure lies somewhere else, recycling the pods is a waste of resources. Imagine recycling a huge number of pods only because DNS, the load balancer or TLS is not working properly.

Hello @cinatic,

Traefik uses the Kubernetes API to build the list of service endpoints it forwards traffic to. Kubernetes only keeps a pod's address in a Service's endpoints while the pod passes its readiness check (a failing liveness probe causes the container to be restarted, which in turn takes the pod out of readiness). Note that liveness and readiness are not the same.

Therefore the endpoints reported to Traefik are already passing the Kubernetes health checks, and duplicating that functionality within Traefik would be meaningless, as every endpoint has already passed.
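
As a rough illustration (container name, image and paths below are just examples), the readiness probe is what decides whether a pod's address shows up in the endpoints Traefik receives, while the liveness probe only decides whether the kubelet restarts the container:

```yaml
# Example pod spec fragment, not a recommended configuration.
containers:
  - name: app                  # example container name
    image: example/app:1.0     # placeholder image
    ports:
      - containerPort: 8080
    readinessProbe:            # failing => pod removed from Service endpoints
      httpGet:
        path: /ready           # example readiness path
        port: 8080
      periodSeconds: 5
    livenessProbe:             # failing => kubelet restarts the container
      httpGet:
        path: /healthz         # example liveness path
        port: 8080
      periodSeconds: 10
```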

Traefik, like most applications, relies on the underlying networking of the orchestration platform. It cannot detect networking misconfiguration or issues, nor is it designed to. If your cluster has internal networking problems after a node restart, resolving those should be your first priority.

For more information on liveness and readiness probes, the Kubernetes documentation has a great description: (Pod Lifecycle | Kubernetes)