Health Check failing with Network Load Balancer on EKS

I figured out the issue and it was nothing to do with Traefik. I was mistaken in thinking that the 503 error was a health check in Traefik when it was actually a health check from kube-proxy. For anyone interest in the solution:

Ultimately, this was due to the fact that my EC2 instance's hostname was different than the node name registered in Kubernetes. For whatever reason, cube-proxy does not like that. So I had to set my cube-proxy hostname to the same value as my node's name in the control plan.

This was done by patching the kube-proxy daemon set with:

env:
- name: NODE_NAME
    valueFrom:
    fieldRef:
        apiVersion: v1
        fieldPath: spec.nodeName

Then you need to reference that NODE_NAME variable in your --hostname-override flag.

- command:
  - kube-proxy
  - --hostname-override=$(NODE_NAME)
  - --v=2
  - --config=/var/lib/kube-proxy-config/config

Now the health check is passing and Traefik is working flawlessly with a Network Load Balancer in EKS.

1 Like