Sporadic 502 on a single instance, not all of them

I have a setup where Traefik runs in Kubernetes behind a corporate PLB and handles all traffic from that point on.

For the service that is failing, we run 3 instances of it (ms-1, ms-2, ms-3) through a Kubernetes Deployment.
Traffic is distributed across them in round robin.

Sometimes we observe random 502s for ONE of the instances, and the failed requests are not forwarded to any other instance. After a while, it corrects itself and the connection to the failing instance is restored.

level=debug msg="'502 Bad Gateway' caused by: dial tcp x.x.x.x:8100: connect: connection refused"
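
To make the symptom concrete, here is a toy model of what I think is happening. It is only a sketch in Python, not Traefik's actual load-balancing code: the `simulate` function, the `down` set, and the `retry` flag are all hypothetical names I made up for illustration. It assumes plain round robin over the three replicas, with one of them refusing connections.

```python
from itertools import cycle

# Replica names taken from the deployment described above.
BACKENDS = ["ms-1", "ms-2", "ms-3"]


def simulate(requests, down, retry=False):
    """Return one HTTP status per request under plain round robin.

    `down` is the set of replicas that refuse connections (hypothetical input).
    With retry=True, a refused request is passed on to the next replicas in the
    rotation; this is the behavior I was hoping for, not a confirmed feature.
    """
    rotation = cycle(BACKENDS)
    statuses = []
    for _ in range(requests):
        attempts = len(BACKENDS) if retry else 1
        status = 502  # stays 502 if every attempted replica refuses
        for _ in range(attempts):
            if next(rotation) not in down:
                status = 200
                break
        statuses.append(status)
    return statuses


if __name__ == "__main__":
    # No retry: every request that lands on ms-2 fails.
    print(simulate(6, down={"ms-2"}))              # [200, 502, 200, 200, 502, 200]
    # With retry: the refused request is served by the next replica.
    print(simulate(6, down={"ms-2"}, retry=True))  # [200, 200, 200, 200, 200, 200]
```

The first output matches what we see: only the requests that hit the refusing replica return 502, and nothing passes them on to a healthy one.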


  1. Does Traefik retry the request against the other replicas if one of them is failing? That doesn't seem to happen here, as the HTTP request returns 502.
  2. What could cause this issue, and how can it be fixed?

I have read other topics and mostly found suggestions to use the correct port, IP, service name, etc. That doesn't seem to be the problem here, as only 1 of the 3 instances fails, and only sometimes.