Router with automatic fallback when healthchecks fail

Hi, I'm trying to set up Traefik on a VPS to forward all traffic to an upstream reverse proxy (also Traefik). If the upstream Traefik instance is unhealthy (fails its health checks), the router should fall back to a status page served by uptime-kuma on the downstream Traefik host itself, which monitors the containers.

This is my current config.yml:

  middlewares:
    status-redirect:
      redirectRegex:
        regex: '^.+\.domain\.com(.*)'
        replacement: "https://status.domain.com/status/domain"
        permanent: false

  services:
    domain-ingress:
      loadBalancer:
        healthCheck:
          path: "/ping"
          interval: "10s"
          timeout: "3s"
          followRedirects: false
        servers:
          - url: "https://dns.upstream.com"
    
    domain-status:
      loadBalancer:
        healthCheck:
          path: "/"
          interval: "10s"
          timeout: "3s"
        servers:
          - url: "http://uptime-kuma:3001"

  routers:
    domain-status:
      entryPoints: 
        - websecure
      rule: Host(`status.domain.com`)
      service: domain-status
      priority: 30
      tls: {}
    
    ## main router
    domain-ingress:
      entryPoints: 
        - websecure
      rule: 'HostRegexp(`^.+\.domain\.com$`)'
      service: domain-ingress
      priority: 20
      tls: {}
    
    ## fallback router for failover
    domain-failover:
      entryPoints: 
        - websecure
      rule: 'HostRegexp(`^.+\.domain\.com$`)'
      service: domain-status
      middlewares:
        - status-redirect
      priority: 10
      tls: {}

I've set this up so that the client's browser is redirected to the uptime monitor's status page at the URL status.domain.com/status/domain.

However, when the health checks against the upstream Traefik instance fail, I simply get a 503 error when trying to access any URL, with a "no available server" message in the browser. According to the Traefik config, priority routing based on health checks should work, no? What am I doing wrong?


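Router rule matching happens before the service health check is considered, so a request is bound to the highest-priority matching router even if that router's service is unhealthy. You can't simply fall back to a different router if the service fails.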

Maybe errors middleware can help (doc).
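
A minimal sketch of what that could look like, reusing the service names from the config above. The middleware name `upstream-down` is made up for illustration, and using `/status/domain` as the `query` is an assumption about which uptime-kuma page should be shown. With this setup the status page is served in place under the original URL rather than via a browser redirect, so assets the page loads with relative paths may not resolve correctly.

```yaml
http:
  middlewares:
    # Hypothetical name, not part of the original config.
    upstream-down:
      errors:
        # Intercept the 503 returned when no upstream server is available.
        status:
          - "503"
        # Service that supplies the replacement response.
        service: domain-status
        # Path requested on that service (assumed status page path).
        query: "/status/domain"

  routers:
    domain-ingress:
      entryPoints:
        - websecure
      rule: 'HostRegexp(`^.+\.domain\.com$`)'
      service: domain-ingress
      middlewares:
        - upstream-down
      priority: 20
      tls: {}
```

Note that this would also replace any 503 the upstream itself returns, not only the "no available server" case.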


Thank you for the feedback! Sadly, it seems traefik might not quite be able to achieve what I intended to do. I could use the errors middleware, but afaik I cannot then redirect the client browser to another URL. The other option I tried was to use the failover service option, but here as well there's no way to then redirect to another URL.
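
For reference, a rough sketch of the failover service approach, assuming the file provider and the service names from the original config. The wrapper name `domain-with-fallback` is hypothetical. The fallback receives the request with its original host and path, which is why uptime-kuma answers with its default page instead of the status page.

```yaml
http:
  services:
    # Hypothetical wrapper service; routes to the fallback while the
    # main service is reported unhealthy.
    domain-with-fallback:
      failover:
        service: domain-ingress
        fallback: domain-status
        healthCheck: {}

    # Main service; its health check drives the failover decision.
    domain-ingress:
      loadBalancer:
        healthCheck:
          path: "/ping"
          interval: "10s"
          timeout: "3s"
        servers:
          - url: "https://dns.upstream.com"

    # Fallback service (uptime-kuma on the local host).
    domain-status:
      loadBalancer:
        servers:
          - url: "http://uptime-kuma:3001"

  routers:
    domain-ingress:
      entryPoints:
        - websecure
      rule: 'HostRegexp(`^.+\.domain\.com$`)'
      # Point the router at the wrapper instead of the main service.
      service: domain-with-fallback
      priority: 20
      tls: {}
```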

In the case of uptime-kuma, if I can only route to the service, all the user sees is the login page instead of the status page at /status/example, which I cannot redirect to.