We're running into an issue with connections from pods in our EKS clusters to an NLB in front of the Traefik 2.x ingress controller. When a connection from a pod to the NLB is routed to a Traefik replica on the same node as the pod, the connection times out. This appears to be caused by the "Preserve client IP address" setting on the NLB target group, which prevents "hairpin" / "u-turn" NAT (see "Troubleshoot your Network Load Balancer" in the Elastic Load Balancing documentation).
We use the ipWhiteList middleware extensively, so I'm not sure that disabling "Preserve client IP address" will be feasible, but is it possible to configure this setting from the Traefik Helm chart?
Otherwise, is there any solution here aside from running Traefik on separate nodes from workload pods that may need to make connections to the NLB?
It looks like one solution is to disable "Preserve client IP address" and enable proxy protocol on both the ELB and Traefik. I think the ELB side should be done via annotations on the LoadBalancer type service.
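For the Traefik half of this, here is a minimal, untested sketch of Helm values that make the two entry points accept proxy protocol headers. The `web` and `websecure` entry point names match the service ports in this thread; the `10.0.0.0/16` CIDR is a placeholder I made up and would need to be replaced with the range the NLB connects from. It assumes the chart's `additionalArguments` value is used to pass static configuration flags.

```yaml
# values.yaml for the Traefik Helm chart (sketch, not checked against a specific chart version)
additionalArguments:
  # Only accept PROXY protocol headers from trusted sources.
  # 10.0.0.0/16 is a placeholder CIDR, not a value from this thread.
  - "--entryPoints.web.proxyProtocol.trustedIPs=10.0.0.0/16"
  - "--entryPoints.websecure.proxyProtocol.trustedIPs=10.0.0.0/16"
```

With trusted IPs set, Traefik should take the client address from the proxy protocol header, which is what the ipWhiteList middleware would then match against.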
The next issue is that this causes the ELB target group health check to fail. I see in the service the following config:
- name: web
- name: websecure
I don't understand where 31876 comes from, but it corresponds to kube-proxy. When proxy protocol is enabled, the health check against this port fails.
Wouldn't it be preferable to use the Traefik service nodePort for the ELB target group health checks anyway? Any clue how this is configured?
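One possible way to point the health check at the traffic port rather than the kube-proxy port, sketched as Helm values. This assumes the service is managed by the AWS Load Balancer Controller and that the chart's `service.annotations` value is passed through to the Service; I have not verified it against this exact setup.

```yaml
# values.yaml for the Traefik Helm chart (sketch under the assumptions above)
service:
  annotations:
    # Health check the same port that receives traffic, using a plain TCP connect.
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "TCP"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "traffic-port"
```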
It turns out those annotations aren't supported by the in-tree AWS controller. It's necessary to use the AWS Load Balancer Controller to correctly configure the target group options.
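For anyone landing here later, a rough sketch of what the service annotations could look like with the AWS Load Balancer Controller installed. The `instance` target type is an assumption on my part (it matches the nodePort-based setup described above), and the exact annotation values accepted depend on the controller version, so treat this as a starting point rather than a verified config.

```yaml
# values.yaml for the Traefik Helm chart (sketch, assumes the AWS Load Balancer Controller is installed)
service:
  annotations:
    # Hand the Service to the AWS Load Balancer Controller instead of the in-tree controller.
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    # Assumption: register nodes (nodePorts) as targets rather than pod IPs.
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "instance"
    # Disable client IP preservation and enable proxy protocol v2 on the target groups.
    service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: "preserve_client_ip.enabled=false,proxy_protocol_v2.enabled=true"
```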