Traefik intercepts TLS challenge in nested architecture (with TLS passthrough)

Hi, first let me describe the infrastructure:

  1. Top level Traefik with file provider and TCP routers defined (with TLS passthrough enabled), routing traffic based on HostSNIRegexp to multiple lower level Traefik instances
  2. Lower level Traefik using docker provider, handling subdomains and their SSL certificates using TLS Challenge

This worked perfectly for a long time. Recently, after upgrading to v3 (possibly already with the latest v2 releases), I started having the following issue:

  • Top level Traefik seems to intercept the TLS challenge; the following message is recorded in its log:

2024-05-31T09:48:55Z DBG github.com/traefik/traefik/v3/pkg/tls/tlsmanager.go:196 > TLS: no certificate for TLSALPN challenge: my.domain

  • Lower level Traefik certificate renewal (or new cert generation) subsequently fails with the following message:

2024-05-31T09:49:01Z ERR Error renewing certificate from LE: {my.domain } error="error: one or more domains had a problem:\n[my.domain] acme: error: 400 :: urn:ietf:params:acme:error:tls :: my.public.ip.address: remote error: tls: unrecognized name\n" acmeCA=https://acme-v02.api.letsencrypt.org/directory providerName=acme-prod.acme

I found that either of the following two workarounds overcomes the issue:

  1. Disable the top level Traefik and route traffic manually to a specific lower level Traefik
  2. Downgrade the top level Traefik to an older version (I verified the issue doesn't exist in v2.8.8)

I'm running the top level Traefik in a Docker container on my MikroTik router. Is there a way to disable the TLS Challenge provider using an environment variable or a command-line flag?

Interesting approach. I thought that a tlsChallenge only works with the first point of contact and is not forwarded. So in a test setup we used httpChallenge for the secondary Traefik.
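
For reference, here is a minimal sketch of what switching a secondary instance to the HTTP challenge could look like in its static configuration. The resolver name "myhttpchallenge", the entry point names "web" and "websecure", the e-mail address, and the storage path are placeholders, not taken from anyone's setup in this thread. Port 80 for the affected hostnames would also have to reach the secondary instance for this to work.

```yaml
# Static configuration of a secondary (lower level) Traefik instance.
# All names and values below are illustrative assumptions.
entryPoints:
  web:
    address: ":80"        # the HTTP challenge is answered over plain HTTP
  websecure:
    address: ":443"

certificatesResolvers:
  myhttpchallenge:        # hypothetical resolver name
    acme:
      email: "admin@example.com"          # placeholder
      storage: "/letsencrypt/acme.json"   # placeholder path
      httpChallenge:
        entryPoint: web   # entry point that serves the challenge response
```

Routers on that instance would then reference the resolver by name in the same way they currently reference the tlsChallenge-based one.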

You want each Traefik to get their own valid TLS for the same domain?

When using HostSNI(), Traefik requires a TLS cert to decrypt the domain. When you enable passthrough, TLS will be activated AFAIK. When no LE is used, Traefik will automatically create a custom cert.

I had a similar setup with Traefik v2.10.7 working. With later versions I also get the remote error: tls: unrecognized name.

The main Traefik instance passes traffic through based on subdomain. The YAML config below works with v2.10.7. It does not work with the latest v2.11.6 or with v3.0.4.

Update 1: there are 3 secondary Traefik instances all up to date with latest Traefik 3.x installed.

tcp:
  routers:
    services-secure:
      entryPoints:
        - "websecure"
      rule: "(HostSNIRegexp(`{subdomain:[a-zA-Z0-9-]+}.services.mydomain.com`))"
      service: "services-secure"
      tls:
        passthrough: true
  services:
    services-secure:
      loadBalancer:
        servers:
          - address: "services:443"
        proxyProtocol:
          version: 2

Update 2: the secondary Traefik instances are configured with "--certificatesresolvers.mytlschallenge.acme.tlschallenge=true" and running in different Docker environments.
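
A side note on the rule in the config above: according to the v2 to v3 migration notes, Traefik v3 expects a plain Go regular expression in HostSNIRegexp under its default rule syntax, rather than the v2 named-placeholder form ({subdomain:...}). Below is a sketch of how the same router might be written for v3. It is an untested adaptation of the config above and is not, by itself, expected to fix the "unrecognized name" error, which also shows up on v2.11.6.

```yaml
tcp:
  routers:
    services-secure:
      entryPoints:
        - "websecure"
      # v3-style rule: an anchored Go regexp with literal dots escaped.
      # Single quotes keep YAML from interpreting the backslashes.
      rule: 'HostSNIRegexp(`^[a-zA-Z0-9-]+\.services\.mydomain\.com$`)'
      service: "services-secure"
      tls:
        passthrough: true
```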

Hi, I'm having the exact same issue right now. Our second level Traefik instances can't renew their certificates anymore.
This used to work flawlessly. Is there a GitHub issue yet?

I didn't find any related issue yet. I opened a new one: Nested traefik instances with "remote error: tls: unrecognized name" · Issue #10880 · traefik/traefik · GitHub

A helpful person dug a bit deeper in the GitHub issue. For anybody finding this conversation first, here is some info: