Scenario Check: Four Docker Hosts - will Traefik work for me?

Summary

I've often seen Traefik cited as the best method for solving a problem I have, and I wanted to present my environment to the community in the hopes of checking if Traefik Proxy would provide the solution I need.

I have four physical servers each hosting multiple docker containers. Some of these services need to be aware of containers on other servers. Over the past year and a half I’ve tried different methods to achieve this:

  • Default Bridge network
  • Attachable Overlay network (without a Swarm)
  • Moving everything possible to the Host network
  • Considering a macvlan network

It was when I was researching the idea of a macvlan network that I found a reply to a thread on network best practices that suggested the use of Traefik - and I have previously come across Traefik examples that seemed like they would fit my needs.


Environment

I have four servers with a minimal Linux installation, plus the Docker engine and compose plugin installed following the official Docker instructions.

  • OS Version/build
    • Debian 11.5 bullseye (or Raspberry Pi OS equivalent)
  • App versions
    • Docker version 23.0.0, build e92dd87
    • Docker Compose version v2.15.1

For the four servers running Debian 11.5 headless, there is the following configuration:

  • Hardware
    • 2 x Edge servers : Raspberry Pi 4b RAM 2GB SSD 120GB
    • 1 x Services server : Intel NUC7CJYHN RAM 16GB SSD 500GB
    • 1 x Security server : Intel NUC8i3BEH RAM 32GB SSD 250GB
  • Container Management
    • Exclusive use of docker-compose.yml files for managing containers
    • Configuration-as-code stored in private github repositories
    • Server specific information held in .env.example files
    • .env files created before container start

Topology

My two edge servers provide local manual redundancy. All servers run common logging services, along with specific services based on their role.


Questions

  1. Can an instance of Traefik Proxy on an Edge server manage and discover services on separate physical hosts? Is it simply a matter of adding a proxy docker socket connection for each host?

  2. If this isn't possible, is Traefik still valid for local discovery, with an installation on each physical host, and all external calls managed by Nginx Proxy Manager?

  3. If it is possible without autodiscovery - i.e. I'd have to manually add entries for each service via http://[host]:[port]/ - then is this possible via TOML/YAML or docker-compose.yml configuration?

  4. If I host Traefik Proxy on my edge servers, is there a configuration that would enable them to cluster or be aware of each other?

  5. I currently host 59 services across 4 hosts, 12 of which are exposed to the internet as subdomains of my domain. I do not want to expose all 59 to the internet. Is there a way with Traefik Proxy to put some of them on an internal domain (for example, ending my domain with .lan rather than .com or similar)? I would manage the internal domain using dnsmasq.d as part of Pi-Hole, if this is possible.

  6. Can any of the containers run in network_mode: host or is it best practice for them all to be on a custom Bridge network?

  7. My domain is managed through Azure DNS Zones. I have subdomain CNAME records which point to [domain].duckdns.org, which in turn resolves to my IP address. My router then forwards requests on ports 443 and 80 to Nginx Proxy Manager. These calls are then routed to the correct services by NPM. Will this configuration (Azure -> DuckDNS -> Router -> Traefik) still work?

Thank you for reading this, and for any insights!

Somehow this sounds complicated. Why not simply use Docker Swarm? It works great with Traefik Configuration Discovery and Docker services running on multiple nodes.

We have 3 Docker Swarm manager nodes running Traefik with ports 80+443 in host-network mode. We use a managed load-balancer in front of them for high availability. All 100+ services are running spread on separate servers, everything is connected with a Docker overlay network. Services to be exposed use labels to indicate their subdomain, Traefik Configuration Discovery handles the rest, forwarding requests round-robin to available service containers.
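To give an idea of what those labels look like on a Swarm service, here is a minimal sketch. The service name, image, domain, entrypoint name and network name are placeholders for illustration, not our actual setup.

```yaml
# Hypothetical stack file; names, image and domain are illustrative only.
services:
  whoami:
    image: traefik/whoami
    networks:
      - proxy
    deploy:
      labels:
        - traefik.enable=true
        - traefik.http.routers.whoami.rule=Host(`whoami.example.com`)
        - traefik.http.routers.whoami.entrypoints=websecure
        # In Swarm mode Traefik cannot detect the port, so it is stated explicitly.
        - traefik.http.services.whoami.loadbalancer.server.port=80

networks:
  proxy:
    external: true  # assumed: a pre-created overlay network that Traefik is also attached to
```

Note that in Swarm mode the labels go under `deploy`, on the service, rather than on the container.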

Portainer is a nice GUI that can manage Docker Swarm services if you don't like the CLI.

I am glad Docker Swarm works for you. For me, I have four servers of vastly different capabilities, so using a swarm or any kind of orchestration is not what I need.

I would say different capabilities are not an issue at all. You can assign labels to nodes and constrain services so they only run on the desired nodes. And everyone using Docker Swarm loves how lightweight it is; it even runs on an RPi.
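A rough sketch of pinning a service to a particular node with a label; the node name `nuc-services`, the label `role` and the service itself are made-up examples.

```yaml
# Label the node first (run on a manager node; names are hypothetical):
#   docker node update --label-add role=services nuc-services
services:
  database:
    image: postgres:15
    deploy:
      placement:
        constraints:
          # Only schedule this service on nodes carrying the label above.
          - node.labels.role == services
```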

Local Docker discovery works fine. In theory you should be able to point providers.docker at a different server and use Configuration Discovery there, maybe even twice with two different servers. But Traefik and the target containers probably need to share a common Docker network.
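Pointing the Docker provider at a remote host might look roughly like this in the static config. This is an untested sketch; the address and port are placeholders, and exposing the Docker API over plain TCP is risky without TLS or a socket proxy in between.

```yaml
# traefik.yml (static configuration); the address is a placeholder.
providers:
  docker:
    # Remote Docker API endpoint instead of the local unix socket.
    endpoint: "tcp://192.168.1.20:2375"
    # Only containers with traefik.enable=true are picked up.
    exposedByDefault: false
```

As far as I know, Traefik would still try to reach the discovered containers on their container IPs, which is why a shared network matters here.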

Alternatively you can create a dynamic configuration file, traefik-dynamic.yml, and load and watch it with providers.file in the static config. In that file you need to define routers and services.
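A minimal sketch of that approach, assuming an entrypoint called `websecure`, a file path under /etc/traefik and a made-up backend at 192.168.1.20:8080 (none of these come from your setup). First the static config:

```yaml
# traefik.yml (static configuration)
entryPoints:
  websecure:
    address: ":443"

providers:
  file:
    filename: /etc/traefik/traefik-dynamic.yml
    watch: true  # reload when the file changes
```

And then the dynamic file with one router and one service:

```yaml
# traefik-dynamic.yml (dynamic configuration)
http:
  routers:
    whoami:
      rule: "Host(`whoami.example.com`)"
      entryPoints:
        - websecure
      service: whoami
      tls: {}
  services:
    whoami:
      loadBalancer:
        servers:
          # Any http://[host]:[port] that Traefik can reach works here.
          - url: "http://192.168.1.20:8080"
```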

Clustering might be possible with Traefik EE, not with the open source version - AFAIK.

For internal services you can use routers that only match internal domains. For security you should also check that the client IP is an internal one.
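Something along these lines could work for an internal-only router. The domain, subnet and backend address are assumptions for illustration, and `ClientIP` matches the address Traefik sees on the connection, so it needs adjusting if another proxy sits in front.

```yaml
# Dynamic configuration (file provider); all names and addresses are placeholders.
http:
  routers:
    grafana-internal:
      # Match the internal hostname AND require a client from the LAN range.
      rule: "Host(`grafana.home.lan`) && ClientIP(`192.168.0.0/16`)"
      service: grafana
  services:
    grafana:
      loadBalancer:
        servers:
          - url: "http://192.168.1.30:3000"
```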

When you set services up manually, it shouldn’t matter what network is used, it just needs to be reachable from Traefik.

DuckDNS should work; in the end it only resolves a domain name to an IP address.

Nice idea.

I went with individual docker hosts and Nginx Proxy Manager instead. Works well for my needs.

Revisiting this, I can confirm that the services I am exposing via containers hosted on 4 separate docker hosts can all be accessed via http://[server]:[port], and because of this I have options:

  • Use the file provider and connect to the services directly
  • Use traefik-kop with redis

I'm going to explore both of these options.
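For the traefik-kop route, the Traefik side would presumably just be the Redis provider in the static config, something like the sketch below. The Redis address is a placeholder, and the traefik-kop configuration on each Docker host is not shown here.

```yaml
# traefik.yml (static configuration); the Redis address is a placeholder.
providers:
  redis:
    endpoints:
      - "192.168.1.10:6379"
```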

I know this is old, but I'm facing the same "issue". I think I might just do it manually in the dynamic config.
But considering the OP's case, I would say Docker Swarm is limited by the need to pass hardware through to containers in some configurations. It is in my use case (Zigbee dongle + GPU for transcoding).

I have since simplified my implementation of Traefik to use a static file for the general configuration, and multiple dynamic files for each server (and two extra for the certificate store & a common middleware).
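A setup along these lines can be sketched with the file provider's `directory` option. The path and the file names in the comments are illustrative assumptions, not the actual files.

```yaml
# traefik.yml (static configuration)
providers:
  file:
    # Every file in this directory is loaded as dynamic configuration.
    directory: /etc/traefik/dynamic
    watch: true

# Hypothetical contents of /etc/traefik/dynamic:
#   edge1.yml  edge2.yml  services.yml  security.yml   (one per server)
#   certificates.yml  middlewares.yml                  (shared extras)
```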