First of all, thank you to all Traefik devs, it's really pleasant to work with Traefik.
Let me give a bit of context:
Traefik versions tested: 2.0, 2.1, 2.2
We use Traefik as a Kubernetes Ingress controller
Our Kubernetes flavor is AWS EKS version 1.14
Now the issue we are facing:
Traefik has trouble handling multiple consecutive server-streaming gRPC calls in a single channel. More precisely, when the client cancels a call and starts a new one, Traefik sometimes fails to propagate the cancel to the backend, leaving the call open on the backend and running forever.
To rule out a false positive, we removed Traefik from the route, and the issue disappeared.
We also used the h2c protocol, as described here: https://docs.traefik.io/user-guides/grpc/
If someone has a starting point for investigating this problem, it would be appreciated.
I wish you all a very good day.
Best regards,
Julien.
Would you happen to have a sample application and configuration that could be used to replicate this issue? Some items that would be useful for our team to investigate would be:
Client / Server application and deployment manifest
Instructions on how to replicate the issue via the Client
Steps to observe the issue on the server, such as metrics or tracing
I realize that's quite a bit of an ask, but your help is greatly appreciated in helping us track this down.