Why do the Traefik container and exposed containers need to share the same network?

Hi everyone,

I'm new to Traefik and inherited a Docker host used as a reverse proxy setup, exposing internal test web apps via HTTPS. That host has two networks: the Docker default bridge 172.17.0.0/16 and traefik_default with 172.18.0.0/16. New containers were added to the Docker default network by default, and the Traefik reverse proxying didn't work for those at all. While the logs looked OK, i.e. Traefik logged that it set up routers, endpoints etc. for all containers of that network, actual communication didn't work and failed with all kinds of errors like HTTP status 499 (client closed connection), gateway timeouts etc. Only when putting the containers in the Traefik network did communication work as well, with those IPs then showing up in the Traefik logs.

Only by accident did I find the following sentences in the Google search results, and only in the preview; I didn't stumble across anything in the docs:

The Traefik container has to be attached to the same network as the containers to be exposed. If no networks are specified in the Docker Compose file, Docker creates a default one that allows Traefik to reach the containers defined in the same file.


Where does that restriction come from? Does it have to do with routing, ports bound to specific addresses, or ...?

Why might two different networks have been created in the first place? There are no docs for the inherited host covering these technical details anymore. What I've found so far is that a docker-compose file was used at some point, and I've read that in this setup an additional network might be created automatically for all contained services. Or are there other, more relevant/likely security considerations?
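
For reference, from what I've read about Compose so far: if a compose file has no top-level networks: section, Compose creates a project-specific bridge network named <projectdir>_default and attaches every service in the file to it. A rough sketch (image and service names are just placeholders), assuming the project directory was called traefik, which would explain the traefik_default network:

# docker-compose.yml in a directory named "traefik" (hypothetical layout).
# With no top-level `networks:` section, Compose automatically creates a
# bridge network called "traefik_default" and attaches every service
# defined in this file to it.
services:
  traefik:
    image: traefik:v2.11
  whoami:                   # placeholder test app, reachable by Traefik
    image: traefik/whoami   # because both services share traefik_default

Containers started separately with a plain docker run, on the other hand, end up on the default bridge (172.17.0.0/16).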

And why does Traefik log as if it's working with the wrong network? It knows its own network and should be able to warn in case of a wrong setup.

Thanks!

When using Traefik Docker configuration discovery (providers.docker with labels on the target service), you need to make sure that Traefik and the target service are on the same Docker network, as that is what is used to connect.

When using multiple Docker networks, make sure to set .docker.network globally in the static config on the provider, or per target service via labels. Traefik will always fetch an IP, but might try to forward requests via the wrong network.
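
A minimal sketch of the global option, assuming a YAML static config and that your shared network is named traefik_default:

# traefik.yml (static configuration): which Docker network Traefik should
# use when connecting to the containers it discovers
providers:
  docker:
    network: traefik_default

The per-service equivalent is the label traefik.docker.network=traefik_default on the target container, which takes precedence over the global setting.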

Alternatively, you can manually configure target services (with a dynamic config file); there you can enter any target domain or IP you want (even external), as long as Traefik can reach it.
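
A rough sketch of such a dynamic config file (hostname, IP and port are placeholders; the file itself is loaded via the providers.file option in the static config):

# dynamic.yml: manually defined router and service, no Docker discovery involved
http:
  routers:
    legacy-app:
      rule: "Host(`legacy.example.com`)"    # placeholder hostname
      service: legacy-app
  services:
    legacy-app:
      loadBalancer:
        servers:
          - url: "http://192.168.131.50:3010"   # any address Traefik can reach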

We prefer dedicated Docker networks (see the simple Traefik example); not sure why the default bridge is used in your case.
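
As a sketch of what that looks like, assuming an externally created network called proxy (docker network create proxy) and placeholder images/hostnames:

# docker-compose.yml of the Traefik stack
services:
  traefik:
    image: traefik:v2.11
    command:
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --providers.docker.network=proxy
      - --entrypoints.web.address=:80
    ports:
      - "80:80"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - proxy
networks:
  proxy:
    external: true   # created once, shared by all stacks

# docker-compose.yml of an application stack (separate file/project)
services:
  myapp:
    image: traefik/whoami   # placeholder app
    labels:
      - traefik.enable=true
      - traefik.http.routers.myapp.rule=Host(`myapp.example.com`)
      - traefik.http.routers.myapp.entrypoints=web
    networks:
      - proxy
networks:
  proxy:
    external: true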

But why, where does the restriction come from? We have proper routing between the different networks. So why does Traefik need to be placed in one particular network to work? Why can't it listen on an IP in one network and connect to targets in other networks etc.? That's how many other network-related tools work as well: bind on some address and talk to others, as long as routing, firewalling etc. work. But for Traefik that doesn't seem to be the case at all.

Sorry, I don’t understand the question.

Traefik can work with multiple (Docker) networks. It can listen directly on the host's ports (host network) and connect to every target service using a different Docker network if you like. Of course, Traefik needs to be part of all the target services' networks.
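
A rough sketch of that idea with two bridge networks (names and images are placeholders): Traefik joins both networks, each app only its own, and Traefik can still reach both targets:

# docker-compose.yml: one Traefik, two apps, two separate bridge networks
services:
  traefik:
    image: traefik:v2.11
    command:
      - --providers.docker=true
    ports:
      - "80:80"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:       # Traefik gets an interface in both networks ...
      - net-a
      - net-b
  app-a:
    image: traefik/whoami
    networks:
      - net-a       # ... while each app sits in only one of them
  app-b:
    image: traefik/whoami
    networks:
      - net-b
networks:
  net-a:
  net-b:

app-a and app-b cannot talk to each other here, but Traefik can forward to either one because it shares a network with each.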

Have a look at the following data: 172.17. is the default Docker network, 172.18. the one created for Traefik.

[...]@[...]:~$ route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         _gateway        0.0.0.0         UG    0      0        0 ens160
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
172.18.0.0      0.0.0.0         255.255.0.0     U     0      0        0 br-9628d8a7c930
192.168.131.0   0.0.0.0         255.255.255.0   U     0      0        0 ens160

[...]@[...]:~$ ip route show
default via 192.168.131.254 dev ens160 proto static
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
172.18.0.0/16 dev br-9628d8a7c930 proto kernel scope link src 172.18.0.1
192.168.131.0/24 dev ens160 proto kernel scope link src 192.168.131.151

[...]@[...]:~$ traceroute 172.17.0.3
traceroute to 172.17.0.3 (172.17.0.3), 30 hops max, 60 byte packets
 1  172.17.0.3 (172.17.0.3)  0.050 ms  0.005 ms  0.003 ms

[...]@[...]:~$ sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1

It's been a while since I worked at that network level, but from my understanding this means that routes for all existing networks are known and consequently, as the host, I can reach clients in all of those networks. Therefore I expected that from within the individual 172.17. and 172.18. networks the other networks would be reachable as well. And Traefik in 172.18. did recognize containers in 172.17.; it logged that it successfully created services with IPs from that other network etc. It's just that actually forwarding requests from outside through 172.18. into 172.17.0.3:3010 didn't work, or reading the responses from there didn't, I'm not sure.

And that's the part I don't understand: who blocks communication between Traefik in 172.18. and a container in 172.17., with Traefik not being in the same network, if the host has correct routes, the host is able to reach all clients in all networks, and Traefik even logs that it creates services in/for the wrong network without any error?

Logs like the following:

time="2024-04-25T14:34:36Z" level=debug msg="Creating middleware" middlewareType=Pipelining entryPointName=http routerName=portainer@docker serviceName=portainer middlewareName=pipelining
time="2024-04-25T14:34:36Z" level=debug msg="Creating load-balancer" routerName=portainer@docker serviceName=portainer entryPointName=http
time="2024-04-25T14:34:36Z" level=debug msg="Creating server 0 http://172.17.0.3:8000" entryPointName=http routerName=portainer@docker serviceName=portainer serverName=0
time="2024-04-25T14:34:36Z" level=debug msg="child http://172.17.0.3:8000 now UP"
time="2024-04-25T14:34:36Z" level=debug msg="Propagating new UP status"
time="2024-04-25T14:34:36Z" level=debug msg="Added outgoing tracing middleware portainer" middlewareName=tracing middlewareType=TracingForwarder entryPointName=http routerName=portainer@docker

I get these logs with Portainer and Traefik in different networks, and yet forwarding requests doesn't actually work in the end. But I don't understand where the blocking part is. I've had non-container services in different networks in the past as well: a server with multiple OpenVPN interfaces and one service listening on all of those interfaces for incoming packets without a problem. I thought it's the same kind of setup as with Traefik and Docker.