Pingable docker socket proxy (falsely?) reports as unresolvable at times

PathNotFound · May 28, 2025, 1:22pm

I am trying to harden my environment and am seeing some strange behavior that I hope someone can help me understand.

I am incorporating a docker socket proxy in my environment. My dev portion is a 3 node Swarm with two Leaders.

I had it working for all the services that needed the socket (three, I think), but as I progressed I noticed that Traefik would sporadically report that it could not resolve the proxy's address, and that it had changed the endpoint definition from the tcp:// protocol to http://. The (in)specific error is as follows:

Provider connection error error during connect: Get \"http://dockerproxy:2375/v1.24/version\": dial tcp: lookup dockerproxy on 127.0.0.11:53: no such host, retrying in 610.626583ms

Despite the retries, the connection never recovers.

Strangely, if I open an interactive shell in the Traefik container, I can ping dockerproxy by name and it does resolve. Stranger still, to troubleshoot I created a config containing only Traefik and the socket proxy and tested it. While the issue is occurring, I can actually see activity over the socket (presumably from Traefik, since it is the only other running service):


time=2025-05-28T13:09:37.430Z level=INFO msg="socket-proxy running and listening..."
time=2025-05-28T13:09:37.430Z level=DEBUG msg="watchdog running"
time=2025-05-28T13:09:37.786Z level=DEBUG msg="allowed request" method=GET URL=/v1.24/version client=10.0.4.4:50798
time=2025-05-28T13:09:37.792Z level=DEBUG msg="allowed request" method=GET URL=/v1.24/services client=10.0.4.4:50798
time=2025-05-28T13:09:37.794Z level=DEBUG msg="allowed request" method=GET URL=/v1.24/version client=10.0.4.4:50798
time=2025-05-28T13:09:37.800Z level=DEBUG msg="allowed request" method=GET URL="/v1.24/networks?filters=%7B%22scope%22%3A%7B%22swarm%22%3Atrue%7D%7D" client=10.0.4.4:50798
time=2025-05-28T13:09:37.802Z level=DEBUG msg="allowed request" method=GET URL="/v1.24/tasks?filters=%7B%22desired-state%22%3A%7B%22running%22%3Atrue%7D%2C%22service%22%3A%7B%22dpz9mw28zcmu882ztc166l0eq%22%3Atrue%7D%7D" client=10.0.4.4:50798
time=2025-05-28T13:09:37.803Z level=DEBUG msg="allowed request" method=GET URL="/v1.24/tasks?filters=%7B%22desired-state%22%3A%7B%22running%22%3Atrue%7D%2C%22service%22%3A%7B%22jbva8z9glfm3ez231xq7fyom9%22%3Atrue%7D%7D" client=10.0.4.4:50798
time=2025-05-28T13:09:52.806Z level=DEBUG msg="allowed request" method=GET URL=/v1.24/services client=10.0.4.4:50798
time=2025-05-28T13:09:52.807Z level=DEBUG msg="allowed request" method=GET URL=/v1.24/version client=10.0.4.4:50798
time=2025-05-28T13:09:52.813Z level=DEBUG msg="allowed request" method=GET URL="/v1.24/networks?filters=%7B%22scope%22%3A%7B%22swarm%22%3Atrue%7D%7D" client=10.0.4.4:50798
time=2025-05-28T13:09:52.815Z level=DEBUG msg="allowed request" method=GET URL="/v1.24/tasks?filters=%7B%22desired-state%22%3A%7B%22running%22%3Atrue%7D%2C%22service%22%3A%7B%22dpz9mw28zcmu882ztc166l0eq%22%3Atrue%7D%7D" client=10.0.4.4:50798
time=2025-05-28T13:09:52.816Z level=DEBUG msg="allowed request" method=GET URL="/v1.24/tasks?filters=%7B%22desired-state%22%3A%7B%22running%22%3Atrue%7D%2C%22service%22%3A%7B%22jbva8z9glfm3ez231xq7fyom9%22%3Atrue%7D%7D" client=10.0.4.4:50798

So it is actually resolving and communicating; it just reports that it is not... sometimes. Cycling the stack reproduces the issue sporadically.
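The proxy log above can also be read mechanically. The following is a minimal sketch, assuming the key=value layout shown in those lines; `parse_line` and `request_gaps` are hypothetical helper names, not part of any tool mentioned here. It measures the time between successive requests for one URL, which for the two /v1.24/services entries above comes to roughly 15 seconds and suggests the client is still polling at a regular interval.

```python
import shlex
from datetime import datetime


def parse_line(line):
    """Split one key=value log line into a dict, honouring quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def request_gaps(lines, url="/v1.24/services"):
    """Return the seconds between successive logged requests for `url`."""
    stamps = []
    for line in lines:
        fields = parse_line(line)
        if fields.get("URL") == url:
            # Timestamp layout assumed from the log lines in this thread.
            stamps.append(
                datetime.strptime(fields["time"], "%Y-%m-%dT%H:%M:%S.%fZ")
            )
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
```

Feeding it the saved proxy log, for example `request_gaps(open("proxy.log"))` with a hypothetical file name, would show whether the requests stop at the moment Traefik starts logging "no such host".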

I thought this might be because Traefik and the socket proxy were not on the same node, but testing disproved that theory: I set the proxy to run globally, and although that seemed to reduce the issue, it still occurred.

I am super confused, and this reads to me like an issue in how Traefik reports service resolution. Any ideas? TIA

bluepuma77 · May 28, 2025, 1:41pm

I am also practicing improved security with my own Docker socket proxy (repo). Sometimes restarted or recreated Docker target services/containers are not recognized.

Restarting Traefik, the proxy, or the target service helps in that case, but it is not an ideal solution.

PathNotFound · May 28, 2025, 3:53pm

Thanks for the response. For me, this is not even a solution but a workaround, and a less than ideal one at that.

Do you see the same behavior? Does Traefik change the protocol of the configured socket target from tcp:// to http://? Does it report that the name is not resolvable? Can you ping the same name from an interactive shell in the Traefik container?
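A single ping only answers the resolvability question for one moment in time. One way to catch an intermittent failure is to sample name resolution repeatedly from a container on the same network. The sketch below is only an illustration: `sample_resolution` is a hypothetical helper, not part of Traefik or Docker, and it uses the system resolver, which may not behave identically to the resolver Traefik itself uses.

```python
import socket
import time


def sample_resolution(host, attempts=5, delay=1.0, resolve=socket.getaddrinfo):
    """Resolve `host` several times and record each outcome.

    Returns a list of (ok, detail) tuples, where detail is the sorted list
    of addresses on success or the error text on failure.
    """
    results = []
    for i in range(attempts):
        try:
            infos = resolve(host, None)
            # Each getaddrinfo entry ends with a sockaddr tuple; its first
            # element is the IP address.
            addresses = sorted({info[4][0] for info in infos})
            results.append((True, addresses))
        except socket.gaierror as exc:
            results.append((False, str(exc)))
        if i < attempts - 1:
            time.sleep(delay)
    return results
```

Calling `sample_resolution("dockerproxy")`, with the service name from the error message in this thread, returns one entry per attempt, so a mix of successes and failures would point at flaky resolution rather than a misreported one.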

To me, this is like being asked to restart your computer up to three times, checking after each restart whether an application issue is fixed. How do you propose monitoring for failure and recovery? I was going to have Prometheus do its thing like I do with other services, but I was not sure I wanted it on my socket network.
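For the monitoring question, one option is a small probe that requests the same version endpoint seen in the proxy log and reports success or failure. This is a sketch under assumptions: `probe_proxy` is a hypothetical helper, the base URL `http://dockerproxy:2375` is taken from the error message in this thread, and the `ApiVersion` field is what the Docker Engine API version response normally includes.

```python
import json
import urllib.error
import urllib.request


def probe_proxy(base_url, timeout=3.0, opener=urllib.request.urlopen):
    """Return (healthy, detail) for a GET on <base_url>/v1.24/version."""
    url = f"{base_url}/v1.24/version"
    try:
        with opener(url, timeout=timeout) as resp:
            body = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # URLError wraps DNS failures such as "no such host";
        # ValueError covers a response body that is not valid JSON.
        return False, str(exc)
    if isinstance(body, dict):
        return True, body.get("ApiVersion", "unknown")
    return False, "unexpected response body"
```

A scheduler or exporter could call `probe_proxy("http://dockerproxy:2375")` periodically and alert on `False`. Note that the probe still has to run somewhere attached to the socket network, so it does not remove that concern.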

PathNotFound · May 28, 2025, 9:02pm

I found another instance of this, with a different socket proxy, that mirrors my experience:

github.com/Tecnativa/docker-socket-proxy

Error accessing proxy from traefik

opened 05:54AM - 17 Jul 23 UTC

lonix1

I'm using this with traefik and it works like a charm. ...Except when it doesn't. For example, when I must update services, and then **restart traefik**. Then traefik restarts without error, and the container becomes "healthy". But requests to services time out.

I thought it was a traefik problem. Then I checked the logs, and every time I have this issue I see something like this:

> time="2023-07-17T07:44:45+03:00" level=error msg="Provider connection error error during connect: Get \"http://tecnativa:2375/v1.24/version\": dial tcp: lookup tecnativa on 127.0.0.11:53: no such host, retrying in 9.503388201s" providerName=docker

The "no such host" part is incorrect, because the tecnativa container is working, and reported as "healthy". I wonder what that `v1.24` is?

The only solution is to restart both traefik and tecnativa (and sometimes other services which are served by traefik!)

I am using the most recent official tecnativa release 0.1.1. And docker 24.0.4 on debian. Any ideas?

Is this a known issue that has been acknowledged as such yet? I see guides for this configuration on both Docker's and Traefik's sites, so it must be a supported, even desirable, configuration from a security standpoint.

PathNotFound · June 19, 2025, 10:53am

I think the error I am getting, wrong protocol and all, is what it appears to be: a resolution issue. I believe the root cause was a race condition in which Traefik looked up the endpoint before the endpoint had started successfully. I am not sure yet how to set a delay (I have not looked into it), but I have this working in this config.
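If the root cause is a startup race as described above, one way to add the missing delay is to wait until the proxy name resolves and its port accepts a connection before starting the dependent service. Below is a minimal sketch, assuming the proxy listens on dockerproxy:2375 as in the error message; `wait_for_endpoint` is a hypothetical helper, not a Traefik or Docker feature.

```python
import socket
import time


def wait_for_endpoint(host, port, timeout=60.0, interval=1.0,
                      connect=socket.create_connection,
                      sleep=time.sleep, clock=time.monotonic):
    """Block until host:port resolves and accepts a TCP connection.

    Returns True once a connection succeeds, or False if `timeout`
    seconds pass without one.
    """
    deadline = clock() + timeout
    while True:
        try:
            with connect((host, port), timeout=interval):
                return True
        except OSError:
            # OSError covers socket.gaierror (name does not resolve yet)
            # as well as refused or timed-out connections.
            if clock() >= deadline:
                return False
            sleep(interval)
```

Calling `wait_for_endpoint("dockerproxy", 2375)` from a wrapper script before launching the dependent process would cover startup ordering only; it would not help with a name that stops resolving later.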

Thanks.

