Exploring the Tailscale-Traefik Integration

Last month, we announced the release of the first beta for Traefik Proxy 3.0, and with it came the exciting new integration with Tailscale, a VPN service that allows you to create your own private networks from your home, using whatever device you want.

But Tailscale goes beyond providing a service to create a private network. It also offers TLS certificate management, where Tailscale provides you with a valid certificate for your internal Tailscale services. Behind the scenes, Tailscale gets the certificate from Let’s Encrypt. The biggest benefit here is that Tailscale manages the certificate lifecycle for you, so there is no need to worry about renewing or exposing an endpoint to resolve TLS challenges between Let’s Encrypt and your proxy instance.

In this article, I want to show you the two main ways Traefik Proxy makes use of Tailscale — one based on the utilization of the TLS management feature and one bonus story for nerds!

You can also check out the announcement of the Tailscale-Traefik integration on the Tailscale Blog.

Tailscale as a TLS certificates provider

Tailscale is, first and foremost, a VPN, which means all traffic between the nodes of your tailnet is already encrypted by WireGuard. If you're running, for example, a webserver on one of your nodes (say, your server), and you want to reach it from another of your nodes (say, your laptop), HTTPS adds nothing in terms of security, and plain HTTP would do.

However, software at the application level (e.g. your browser) is unaware that traffic is already encrypted, and it might, rightfully so, "complain" about it — your browser will display the Not secure warning near the URL bar. Other tools might even be stricter about it.

For this reason, Tailscale also offers a (beta) feature for HTTPS certificates, which provides you with a Let's Encrypt TLS certificate for the nodes in your tailnet. Once this feature is enabled, instead of your laptop reaching your server with http://your-server-tailscale-IP, you can reach it with https://your-server-tailnet-name (assuming your server can do HTTPS as well). That makes your browser happy, as it sees you are using TLS, and your life easier.

If you are interested in trying this feature without Traefik Proxy, you need to follow the steps below:

  • Set up the Tailscale bits
  • Set up your webserver (or reverse proxy) to handle TLS
  • Call tailscale cert on your server to ask Tailscale to provide you with the TLS certificate
  • Adjust your webserver configuration to take that certificate into account
  • Handle certificate renewal later on
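As a rough illustration of the tailscale cert step in the list above, here is a minimal Python sketch that assembles the command line. The helper name build_cert_command, the output directory, and the file naming are all hypothetical. --cert-file and --key-file are flags of tailscale cert, but it is worth checking tailscale cert --help on your version. The function only builds the argument list; actually running it requires a node on a tailnet with HTTPS certificates enabled.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def build_cert_command(domain: str, out_dir: str) -> list[str]:
    """Build the argv for `tailscale cert`, writing the pair into out_dir.

    The <domain>.crt / <domain>.key naming is an assumption made for this
    sketch, not something Tailscale mandates.
    """
    out = PurePosixPath(out_dir)
    return [
        "tailscale",
        "cert",
        f"--cert-file={out / (domain + '.crt')}",
        f"--key-file={out / (domain + '.key')}",
        domain,
    ]
```

The resulting list can be handed to subprocess.run(..., check=True), and the same call repeated on a schedule is one simple way to cover the renewal step.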

Automating TLS certificates with the Traefik-Tailscale integration

Now, if you want to try this Tailscale feature with Traefik Proxy, you have a way of automating this process. Traefik comes with an ACME provider, which can be configured to automatically ask Let's Encrypt for certificates, for the relevant routes described on its configured routers.

In that respect, Tailscale's role as a certificate provider is very similar to Let's Encrypt, so it made sense for us to capitalize on the experience we already had with the ACME provider, and adapt the work to add the same feature for Tailscale.

Let's showcase the feature with an example of a setup from A to Z.

  1. Start with the Tailscale part: In your tailnet’s DNS settings, enable MagicDNS, make a note of your tailnet name for later (in the example below, you'll have to replace yak-bebop.ts.net with your own), and enable HTTPS Certificates
  2. Configure Traefik Proxy: We'll use the file provider for simplicity, but there are examples for the other providers on our documentation page that you can easily adapt for this example.

Static configuration:

[entryPoints]
	[entryPoints.websecure]
		address = ":443"
[providers]
	[providers.file]
		filename = "/path/to/your/dynamic.toml"
[certificatesResolvers.myresolver.tailscale]
[api]
	debug = true
[log]
	level = "DEBUG"

Dynamic configuration:

[http]
	[http.routers]
		[http.routers.towhoami]
			service = "whoami"
			rule = "Host(`myserver.yak-bebop.ts.net`)"
			[http.routers.towhoami.tls]
				certResolver = "myresolver"
	[http.services]
		[http.services.whoami]
			[http.services.whoami.loadBalancer]
				[[http.services.whoami.loadBalancer.servers]]
					# docker run -d -p 6060:80 traefik/whoami
					url = "http://localhost:6060"

Note: In the Host rule, we're using our full Tailscale hostname for the server. It is the concatenation of the server's machine name, myserver (which you can find in your Tailscale admin console, or with the tailscale status command), and the tailnet name, yak-bebop.ts.net, which is provided with MagicDNS.
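To make the concatenation concrete, here is a small Python sketch that derives the full hostname and the matching Host rule. Both function names are hypothetical helpers, not part of Traefik or Tailscale.

```python
def tailnet_hostname(machine: str, tailnet: str) -> str:
    """Join a machine name and a tailnet name into the full MagicDNS hostname."""
    # Strip stray dots so "myserver." and ".yak-bebop.ts.net" still join cleanly.
    return f"{machine.strip('.')}.{tailnet.strip('.')}"


def host_rule(hostname: str) -> str:
    """Format a Traefik router rule matching exactly this hostname."""
    return f"Host(`{hostname}`)"
```

With the values from the example, host_rule(tailnet_hostname("myserver", "yak-bebop.ts.net")) gives the rule used in the dynamic configuration above.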

  3. Start Traefik Proxy: On startup, Traefik should automatically try to get certificates for TLS routes with a Host rule. In our example, if everything goes well and your log level is DEBUG, you should see log lines such as:
2023-01-09T11:02:51+01:00 DBG ../../pkg/server/router/tcp/manager.go:235 > Adding route for myserver.yak-bebop.ts.net with TLS options default entryPointName=websecure
2023-01-09T11:03:36+01:00 DBG ../../pkg/provider/tailscale/provider.go:253 > Fetched certificate for domain "myserver.yak-bebop.ts.net" providerName=myresolver.tailscale
2023-01-09T11:03:36+01:00 DBG ../../pkg/server/configurationwatcher.go:226 > Configuration received config={"http":{},"tcp":{},"tls":{},"udp":{}} providerName=myresolver.tailscale
2023-01-09T11:03:36+01:00 DBG ../../pkg/tls/certificate.go:158 > Adding certificate for domain(s) myserver.yak-bebop.ts.net

Enjoy your TLS route! You can now access your Tailscale hostname (https://myserver.yak-bebop.ts.net in the example) over HTTPS in your browser.
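If you would rather check the route from a script than from a browser, the following Python sketch opens a verified TLS connection and reads the issuer of the certificate that Traefik serves. It uses only the standard library. The function names are hypothetical, and the hostname you pass in is assumed to be reachable from a machine on your tailnet.

```python
from __future__ import annotations

import socket
import ssl


def issuer_organization(cert: dict) -> str | None:
    """Extract the issuer's organizationName from a getpeercert() dict."""
    # The issuer is a tuple of RDNs, each a tuple of (key, value) pairs.
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return None


def fetch_peer_cert(hostname: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect over TLS, verifying chain and hostname, and return the peer cert."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()
```

Calling issuer_organization(fetch_peer_cert("myserver.yak-bebop.ts.net")) should name the certificate authority. Since Tailscale obtains its certificates from Let's Encrypt, that is the issuer you would expect to see. A failed handshake raises ssl.SSLError, which is a useful signal that the certificate was not fetched.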

Tailscale as a tunnel between a Mac host and containers

The (convoluted) background

The Traefik project, just like most large software projects, has integration test suites that we run both on our development machines (mostly laptops), and automatically on our Continuous Integration (CI) platform when submitting a pull request, notably to detect regressions.

In our case, that usually means we use at least three components for a test: Traefik itself; a third-party component, like a backend (e.g. the traefik/whoami webserver); and the test itself, which is essentially a client (HTTP or otherwise), written in Go, that makes requests.

For historical reasons, at some point, we ended up in a situation where all of these components would, by default, run in Docker. The rationale is the usual one: you want reproducibility, so that the setup is the same everywhere and the test runs on the CI as well as on your laptop. On your dev laptop, it also spares you from installing and configuring various third parties (think databases, for example).

However, on Mac machines, there are two (sort of interlinked) major drawbacks to that situation: slow run time, and inconvenient workflow.

When we are debugging, or working on a new feature, it is pretty important to be able to make a change, rerun one test in particular, and get some feedback in a decent amount of time. Otherwise, aside from being tedious, you're stuck in this in-between where there's not enough time to context-switch to something else, and it's not fast enough to stay in the ideal flow where you can keep on iterating. Having Traefik Proxy in Docker is an obstacle for two reasons. First, the Docker image has to be rebuilt for every little iteration you want to try, which is automated but is still somewhat slow. Second, it means the whole test setup and run is also way slower than it should be, especially on Mac, mostly because there's a Linux VM in between.

As the vast majority of the Traefik team is using Mac now, this has become an annoying enough problem that we wanted to take care of it. And that is when one of us nerd-sniped another into using Tailscale. Why, you ask? Because of the aforementioned Linux VM.

See, the best of both worlds would be to keep the third-party backends in Docker (for convenience), but take Traefik and the client code out of Docker. This means the clients, and Traefik, have to be able to reach the backends in their containers (and sometimes vice-versa). On Linux, it is somewhat doable, but on Mac, it gets considerably harder, given that there is a Linux VM in between the Mac host and the containers. So, we wanted to see if Tailscale allowed us to achieve that with minimal work.

The solution

The basic idea that is key to the solution is another nifty Tailscale feature, the subnet router. If a Tailscale node sits in a container that is in the same Docker network as all our other Docker containers, then it can reach all these containers. And since it is also part of our tailnet, it can also reach our Mac host (assuming that this host is also part of the tailnet, of course), acting as a gateway between both networks. For the more technically inclined, most of the changes related to that idea are in this commit.

The gist of it is that we now have a flag (the IN_DOCKER environment variable) that indicates whether to build Traefik Proxy and run the tests in Docker or directly on the host.

If IN_DOCKER is not true, we look for a tailscale.secret file, which should contain a Tailscale auth key (ephemeral, but reusable). We then use the Docker Compose API to start a tailscale/tailscale container in the same Docker network as the other containers. In it, we run Tailscale with the auth key and --advertise-routes=172.31.42.0/24, which makes it a subnet router for all the Docker containers.
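The decision logic can be sketched in a few lines of Python. This is an illustration only: the real test harness is written in Go and drives Docker Compose, and how it hands these values to the container is not shown in this article. The function names are hypothetical, and treating only the string "true" as enabling Docker mode is an assumption. --authkey and --advertise-routes are flags of tailscale up.

```python
from __future__ import annotations

from pathlib import Path


def runs_in_docker(env: dict[str, str]) -> bool:
    """Interpret IN_DOCKER; accepting only "true" is an assumption of this sketch."""
    return env.get("IN_DOCKER", "").strip().lower() == "true"


def read_auth_key(secret_file: Path) -> str:
    """Read the Tailscale auth key from a file such as tailscale.secret."""
    key = secret_file.read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"{secret_file} is empty")
    return key


def tailscale_up_args(auth_key: str, routes: list[str]) -> list[str]:
    """Build the `tailscale up` argv that turns a node into a subnet router."""
    return [
        "tailscale",
        "up",
        f"--authkey={auth_key}",
        f"--advertise-routes={','.join(routes)}",
    ]
```

When runs_in_docker(os.environ) is false, the key from read_auth_key(Path("tailscale.secret")) and the route 172.31.42.0/24 would be passed to tailscale_up_args to obtain the command to run inside the container.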

Finally, for the gritty details on the Tailscale side, I want to mention two things:

  1. As seen above, you need to generate an ephemeral, reusable auth key, which can be done on your Keys page at https://login.tailscale.com/admin/settings/keys
  2. You need an autoApprovers section in the ACLs, in order to automatically approve the routes advertised by the subnet router. For our purposes, it looks like this:
"autoApprovers": {
	// Allow myself to automatically advertise routes for Docker networks
	"routes": {
		"172.0.0.0/8": ["your_tailscale_identity"],
	},
},
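A quick way to sanity-check that an advertised route falls under an auto-approved prefix is Python's ipaddress module. The sketch below assumes, as the example in this article implies, that an autoApprovers entry covers any route contained within the listed prefix. The function name is hypothetical.

```python
from __future__ import annotations

import ipaddress


def route_is_auto_approved(route: str, approved_prefixes: list[str]) -> bool:
    """Return True if route is contained in at least one approved prefix."""
    net = ipaddress.ip_network(route)
    for prefix in approved_prefixes:
        approved = ipaddress.ip_network(prefix)
        # subnet_of() raises TypeError across IP versions, so compare them first.
        if net.version == approved.version and net.subnet_of(approved):
            return True
    return False
```

For the values used here, route_is_auto_approved("172.31.42.0/24", ["172.0.0.0/8"]) holds, so the route advertised by the container would not need manual approval in the admin console.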

And that's pretty much it!

So even if using a VPN to communicate between your host and some containers in Docker seems like overkill, it actually works! And it does make our lives simpler, as it considerably improves the feedback loop when iterating on tests.

Is there a better solution? Probably.

Would it require considerably more changes to our test setup? Maybe.

Was it fun to do? Definitely 😉

Don’t forget to check out the announcement of the Tailscale-Traefik Proxy integration on the Tailscale Blog and their official documentation.


This is a companion discussion topic for the original entry at https://traefik.io/blog/exploring-the-tailscale-traefik-proxy-integration/