First of all, thank you to all the Traefik devs; it's really pleasant to work with Traefik.
Let me give a bit of context:
- Traefik versions tested: 2.0, 2.1, 2.2
- We use Traefik as a Kubernetes Ingress controller
- Our Kubernetes flavor is AWS EKS version 1.14
Now the issue we are facing:
Traefik has trouble handling multiple consecutive server-streaming gRPC calls in a single channel. More precisely, when the client cancels a call and starts a new one, Traefik sometimes fails to propagate the cancel to the backend, leaving the call open on the backend and running forever.
To rule out a false positive, we removed Traefik from the route, and the issue disappeared.
We also used the h2c protocol, as described here: https://docs.traefik.io/user-guides/grpc/
If anyone has a starting point for investigating this problem, it would be appreciated.
I wish you all a very good day.
Would you happen to have a sample application and configuration that could be used to replicate this issue? Some items that would be useful for our team to investigate would be:
- Client / Server application and deployment manifest
- Instructions on how to replicate the issue via the Client
- Steps to observe the issue on the server, such as metrics or tracing
I realize that's quite a bit of an ask, but your help is greatly appreciated in helping us track this down.
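For the server-side observation, one possible approach is a gauge of in-flight streams: if a cancel is never propagated, the gauge stays above zero after the client has moved on. Below is a minimal stdlib-only sketch of that bookkeeping, not taken from any real application. The names `StreamGauge` and `serve_stream` are hypothetical, and `is_active` stands in for whatever cancellation signal the server exposes (in a Python gRPC servicer this would presumably be `context.is_active`).

```python
import threading
import time


class StreamGauge:
    """Thread-safe count of server-streaming calls currently running."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0

    def inc(self):
        with self._lock:
            self._active += 1

    def dec(self):
        with self._lock:
            self._active -= 1

    @property
    def active(self):
        with self._lock:
            return self._active


def serve_stream(is_active, gauge, interval=0.01):
    """Generator standing in for a server-streaming handler (hypothetical).

    `is_active` is a callable that returns False once the call has been
    cancelled. The gauge is decremented in `finally`, so it only drops
    when the handler actually stops.
    """
    gauge.inc()
    try:
        n = 0
        while is_active():
            yield n  # one streamed message
            n += 1
            time.sleep(interval)
    finally:
        gauge.dec()
```

Exporting `gauge.active` as a metric, or simply logging it periodically, would show whether calls stay open on the backend after the client cancels them. With a proxy that drops the cancel, `is_active` would keep returning true and the gauge would never return to zero.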
Ok, here we are. One of our developers just created this repo to illustrate the issue:
If we run just a few connections, everything works as expected, but if we increase the number, we hit this issue with gRPC and Traefik.
Also, we ran some tests locally with and without Traefik, and we were not able to reproduce the issue when Traefik is not in the path.
If you have some questions, feel free to ask.
@Julien Thank you for the update and reproducible build. I've confirmed what appears to be a bug and opened an issue on the official Traefik repo here: https://github.com/containous/traefik/issues/6791
Good morning @notsureifkevin. Glad to help improve Traefik.
I look forward to deploying a patched version.
By the way, if you need us to test a patch, just let me know.