HTTPS on IPv6 stops working after a long time

We're running Traefik 2.4 in Docker for the website of one of our local supermarket chains, and we've noticed that HTTPS stops working via IPv6 after a while, roughly a month or so. HTTPS via IPv4 keeps working, however.

Restarting the Traefik container solves the issue.

We run other sites using Traefik 2.4 on dual-stack environments and haven't seen this issue come up there, so this seems like an issue that crops up with a decent amount of traffic over longer periods of time.

Restarting the container once in a while isn't really an issue, as it takes a few seconds. But this seems like an IPv6-specific leak somewhere, somehow. And it seems like the decent thing to do to mention it here...

Hello @SanderAtSnakeware

Thanks a lot for using Traefik.

Would it be possible to have a reproducible scenario for that use case? That would be super helpful for providing a solution or a fix if we are facing a bug here.

Thank you,

Hi @jakubhajek,

I'm not sure if I can find a reproducible scenario. Until now this has occurred only at a high volume site and only with IPv6 and HTTPS.

Maybe this is doable using a traffic generator that does http/https requests on both IPv4 and IPv6, for a long period of time. But this might be a memory leak or something that is visible earlier, before HTTPS via IPv6 breaks.
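
A rough sketch of what such a traffic generator could look like, in Python. This is only an illustration and has not been run against the affected site: the names `request_once` and `run_load`, the HEAD request on `/`, and the default ports are all assumptions. It sends one request per address family and scheme in each round and tallies the outcomes, so an IPv6-only failure would show up as a separate counter.

```python
import socket
import ssl
import time
from collections import Counter

FAMILIES = {"ipv4": socket.AF_INET, "ipv6": socket.AF_INET6}


def _exchange(conn, host):
    """Send a minimal HEAD request and label the reply."""
    request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    conn.sendall(request.encode("ascii"))
    reply = conn.recv(64)
    return "ok" if reply.startswith(b"HTTP/") else "bad-reply"


def request_once(host, port, family, use_tls, timeout=5.0):
    """Make one request over a single address family; return an outcome label."""
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return "no-address"  # e.g. the host has no address of this family
    fam, socktype, proto, _canon, sockaddr = infos[0]
    try:
        with socket.socket(fam, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            if not use_tls:
                return _exchange(sock, host)
            context = ssl.create_default_context()
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return _exchange(tls, host)
    except OSError as err:
        # Resets, refusals, timeouts and TLS errors are all OSError subclasses.
        return type(err).__name__


def run_load(host, rounds, http_port=80, https_port=443, pause=0.0):
    """Alternate HTTP/HTTPS requests over IPv4 and IPv6 and tally the outcomes."""
    tally = Counter()
    targets = (("http", http_port, False), ("https", https_port, True))
    for _ in range(rounds):
        for name, family in FAMILIES.items():
            for scheme, port, use_tls in targets:
                outcome = request_once(host, port, family, use_tls)
                tally[(name, scheme, outcome)] += 1
        time.sleep(pause)
    return tally
```

Something like `run_load("example.org", rounds=100000, pause=0.1)` (with a placeholder hostname) would then run for a long time, and a growing count for a key such as `("ipv6", "https", "ConnectionResetError")` would point at the same symptom.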

Hello @SanderAtSnakeware

Do you have any specific logs that might be related to the issue you are experiencing? Having a reproducible use case would be great in order to replicate it in a local lab environment.

I also wonder why you suspect that the problem is IPv6-specific. I am here to help you, but we need more details to find the root cause and prepare a fix.

Thank you for your kind collaboration,
Jakub

Hi @jakubhajek,

I think I have to wait until this happens again and check the traefik logs. This might take some time.

The reason I believe this is IPv6-specific is that only connections to the containerized website over IPv6 stop working when this happens. Connections over IPv4 keep working, and these are made to the same container. So the only difference is IPv6 vs IPv4 in this case.

I think I'll set up specific IPv6-only monitoring for this site so I'm alerted as soon as this happens again.
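
For what it's worth, the check I have in mind would be something along these lines. It is a sketch, not what actually runs in our monitoring: `head_ok`, `classify` and `check` are made-up names, and pinning the address family by binding to a wildcard source address is a workaround I'm assuming is acceptable here, since `http.client` has no explicit address family option. Any HTTP response, whatever the status code, counts as "reachable".

```python
import http.client

# Binding to the wildcard address of one family makes the underlying
# socket.create_connection() skip addresses of the other family.
WILDCARD = {"ipv4": ("0.0.0.0", 0), "ipv6": ("::", 0)}


def head_ok(host, family, use_tls=True, port=None, timeout=5.0):
    """Return True if a HEAD request over the given family gets any HTTP response."""
    cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = cls(host, port, timeout=timeout, source_address=WILDCARD[family])
    try:
        conn.request("HEAD", "/")
        conn.getresponse()
        return True
    except (OSError, http.client.HTTPException):
        # Covers resets, refusals, timeouts, TLS errors and malformed replies.
        return False
    finally:
        conn.close()


def classify(v4_ok, v6_ok):
    """Turn the two probe results into a single status label."""
    if v4_ok and v6_ok:
        return "ok"
    if v4_ok:
        return "ipv6-down"  # the case described in this issue
    if v6_ok:
        return "ipv4-down"
    return "both-down"


def check(host, use_tls=True, port=None):
    """Probe the host over IPv4 and IPv6 and report which family fails."""
    return classify(
        head_ok(host, "ipv4", use_tls, port),
        head_ok(host, "ipv6", use_tls, port),
    )
```

Running `check(...)` from a dual-stack machine every minute or so and alerting on `"ipv6-down"` should catch this without also firing when the whole site is down.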

Is it possible to set the Traefik logging to debug while it displays this behavior? Or does that require a restart of the Traefik container, thus eliminating the problem?

Hello @SanderAtSnakeware

You can set the log level to DEBUG, but not while the problem is actually occurring. The log level is part of Traefik's static configuration, which means Traefik has to be restarted for the change to take effect.

Thank you, Jakub

Well, just had this issue again. Nothing to see in the logs or console, Traefik restart fixed it.

I got a notice that the site was unavailable, so I checked it on an IPv6-enabled machine and indeed, Chrome gave a connection reset error. I also tried the same site without HTTPS and that didn't work either.

So, it seems the issue isn't tied to HTTPS specifically, but to IPv6. Also, this issue only came up after the container had been running continuously for over six weeks.

Interesting.