I have a very weird situation.
I am running Traefik in Docker alongside 40+ other Docker containers. Many of them are proxied by Traefik 3.1.4. They have all worked great for over a year, except one newer container that stops working every 2-3 days and returns Bad Gateway errors. It uses the same configuration as the others to get an SSL certificate and be proxied.
Here's the bizarre part: if I restart Traefik, the affected container, or any other container, the problem goes away and the app is back up and responding again.
Docker config for the container:

        networks:
          - proxy
          - vikunja
        labels:
          traefik.enable: true
          traefik.http.routers.vikunja.rule: Host(`tasks.example.com`)
          traefik.http.routers.vikunja.entrypoints: websecure
          traefik.http.routers.vikunja.tls.certResolver: myresolver

    networks:
      proxy:
        external: true
      vikunja:

These are the same four label lines I add to every Docker container that needs SSL.
The container works perfectly and very fast for 2-3 days, then one day it stops responding and I either get a "Bad Gateway" error or the request hangs for a long time before timing out.
Initially I was restarting the container, and that fixed the problem. Then I realized restarting Traefik fixed it as well. Then I stopped another container I wasn't using anymore and noticed that fixed it too. This makes me think the problem is in Traefik and not the container. But this is the only container that does it; all my other 25+ SSL-proxied containers run flawlessly.
I have looked at the logs of both the container and Traefik, and there are no error messages, or any messages at all. Traefik runs error free, and the container stops writing logs after this happens (likely because it is no longer receiving traffic).
I'm completely lost as to what else to look at. It's bizarre that it affects only one container, and that doing anything that touches Traefik seems to fix it for the next 2-3 days until it happens again. I've been using this setup for over a year, and it has worked fantastically for every container except this one.
When the problem happens, I have gone into the affected container and confirmed it was still up and running. I could reach the app using its internal Docker IP and port, but not through the Traefik proxy.
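
In case it helps anyone reproduce that check, here is a rough sketch of how the container's internal IPs can be listed per network. The container is attached to both proxy and vikunja, so it has one IP on each. The script reads the `NetworkSettings.Networks` section of `docker inspect` output. The container name and the backend IP passed on the command line are assumptions for illustration. The backend IP would have to be read manually from wherever Traefik shows the server address it is proxying to.

```python
from __future__ import annotations

import json
import subprocess
import sys


def network_ips(inspect_output: str) -> dict[str, str]:
    """Map each attached Docker network name to the container's IP on it."""
    containers = json.loads(inspect_output)  # `docker inspect` prints a JSON array
    networks = containers[0]["NetworkSettings"]["Networks"]
    return {name: info.get("IPAddress", "") for name, info in networks.items()}


def network_for_ip(ips: dict[str, str], backend_ip: str) -> str | None:
    """Return the network whose IP matches backend_ip, or None if none match."""
    for name, ip in ips.items():
        if ip == backend_ip:
            return name
    return None


# Hypothetical usage (name and IP are placeholders):
#   python check_ips.py vikunja 172.18.0.5
if __name__ == "__main__" and len(sys.argv) == 3:
    container, backend_ip = sys.argv[1], sys.argv[2]
    result = subprocess.run(
        ["docker", "inspect", container],
        check=True, capture_output=True, text=True,
    )
    ips = network_ips(result.stdout)
    for name, ip in ips.items():
        print(f"{name}: {ip}")
    print("Backend IP belongs to network:", network_for_ip(ips, backend_ip))
```

Running it once while the app works and again while it returns Bad Gateway would show whether the address Traefik is using belongs to the proxy network or the vikunja one, and whether it changed in between.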