TCP Failover for Master/Backup scenario

chrisbrookes · October 8, 2019, 4:09pm

I'm trying to set up a configuration where I have two SFTP servers deployed and use one as the master, and when that fails, fail over to the backup. The backup in this case is just another SFTP server, perhaps with some replication from the master one. The backup is never used unless the master fails.

I've tried a WRR (Weighted Round Robin) setup with the backup server's weight set to 0 (see config below). This doesn't seem to work, though. When I disable the master server, trying to connect through the Traefik route gives a "connect: no route to host" error, as it's still trying to connect to the master server. I don't think it's possible to set up health checks for TCP services, so I don't think that can help me.

Is this even possible?

Static:

[entryPoints]

  [entryPoints.http]
    address = ":80"
  [entryPoints.https]
    address = ":443"
  [entryPoints.sftp]
    address = ":22"

Dynamic:

# Dynamic configuration for test. Managed by Ansible.

[tcp]
  [tcp.routers]
    [tcp.routers.test_at_sftp]
      entryPoints = ["sftp"]
      rule = "HostSNI(`*`)"
      service = "test"

  [tcp.services]
    #-------------------------------------------------------------------------------------------------------------------
    # Weighted Round Robin services
    #-------------------------------------------------------------------------------------------------------------------
    [tcp.services.test]
      [[tcp.services.test.weighted.services]]
        name = "local_sftp"
        weight = 1

      [[tcp.services.test.weighted.services]]
        name = "backup-sftp"
        weight = 0

    [tcp.services.local_sftp]
      [tcp.services.local_sftp.loadBalancer]
        [[tcp.services.local_sftp.loadBalancer.servers]]
          address = "sftp:22"

    [tcp.services.backup-sftp]
      [tcp.services.backup-sftp.loadBalancer]
        [[tcp.services.backup-sftp.loadBalancer.servers]]
          address = "centos-sftp2:2200"

Thanks,
Chris.

ldez · October 8, 2019, 6:53pm

Hello,

A weight of 0 means that the server is disabled.
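
As a rough illustration of why the backup never receives a connection, here is a minimal Python sketch of weighted selection. It is not Traefik's actual implementation; `weighted_cycle` is a hypothetical helper, and the only assumption it encodes is that a weight of 0 gives a service no share of the traffic.

```python
import itertools

def weighted_cycle(services):
    """Cycle through service names in proportion to their weights.

    A service with weight 0 contributes no entries to the pool, so it is
    never returned. The picker has no notion of health, so a failed
    master would still be returned.
    """
    pool = [name for name, weight in services for _ in range(weight)]
    if not pool:
        raise ValueError("no service has a positive weight")
    return itertools.cycle(pool)

# Same names and weights as the config in the question.
services = [("local_sftp", 1), ("backup-sftp", 0)]
picker = weighted_cycle(services)
picks = [next(picker) for _ in range(5)]
print(picks)
```

Every pick is `local_sftp`, which matches the behaviour reported in the question: with the master down, connections are still sent to it.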

wuppi · January 24, 2020, 2:48pm

Hi,
I'm trying to build a TCP load balancer as well, which should fail over if one of the backend systems stops working. Did you solve your problem, and if so, how?
Thanks,
Thomas

chrisbrookes · January 24, 2020, 8:04pm

Hi Thomas,

I don't think it's possible with the current capabilities of Traefik. Certainly, nobody on here has piped up to tell me any different.

We've left our setup as manual failover for now. But I've been thinking about solutions involving our other systems. We use Zabbix (monitoring), Jenkins (CI and operations jobs) and Ansible (configuration provisioning; this is how we roll out and configure Traefik). I was thinking of linking those together:

Zabbix alert after master fail detected -> webhook call -> Jenkins -> Job that runs Ansible -> Ansible re-configures Traefik TCP config to switch the slave/backup to master.
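
The last step of that chain could look something like the Python sketch below. It is only an illustration of the idea, not what we run: `is_reachable`, `choose_backend` and `render_dynamic_config` are hypothetical helpers, the probe is a plain TCP connect rather than a real SFTP check, and it assumes Traefik's file provider is watching the dynamic config file so that a rewritten file is picked up.

```python
import socket

def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def choose_backend(primary, backup, probe=is_reachable):
    """Use the primary while the probe succeeds, otherwise the backup."""
    return primary if probe(*primary) else backup

def render_dynamic_config(backend):
    """Render a TOML fragment pointing the 'test' TCP service at one server."""
    host, port = backend
    return (
        "[tcp.services.test.loadBalancer]\n"
        "  [[tcp.services.test.loadBalancer.servers]]\n"
        f'    address = "{host}:{port}"\n'
    )

primary = ("sftp", 22)
backup = ("centos-sftp2", 2200)
# Stub probe that pretends the master is down, so the example runs offline.
chosen = choose_backend(primary, backup, probe=lambda host, port: False)
config = render_dynamic_config(chosen)
print(config)
```

In the real pipeline the rendered fragment would be written out by Ansible, and the default `is_reachable` probe (or the Zabbix alert itself) would decide which backend to use.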

Which is relatively complicated compared to Traefik just having failover similar to the HTTP side of things.

Hope that helps,
Thanks,
Chris.

wuppi · January 27, 2020, 8:53am

Hi Chris,
thank you, that helped a lot and that's what I expected ;-( We are thinking about creating a "provider" for Traefik which should be able to dynamically change the configuration... hope that works...

Thank you,
best regards
Thomas

capnjosh · February 28, 2020, 2:00am

@chrisbrookes, you can achieve this failover goal if you take a "router priority" approach, rather than a service weighted round robin approach.

Instead of using weightings with services, create two routers: set priority=2 on the router that routes to the primary host and priority=1 on the router for the failover (all other config can be the same). That way, all traffic gets routed to the primary, and when it disappears, traffic gets routed to the secondary. I've got this working with ConsulCatalog for some regional endpoints that can fail over to any service globally if there are no valid services in the same region.
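
Here is a small Python sketch of the selection behaviour I'm describing. It models the idea only and is not Traefik's router code; `select_router` and the `available` flag are hypothetical. The key assumption is that the provider withdraws the primary router when its service has no valid instances. If both routers stay defined, the higher priority one keeps winning and nothing fails over.

```python
def select_router(routers):
    """Return the name of the highest-priority router that is currently available."""
    available = [r for r in routers if r["available"]]
    if not available:
        return None
    return max(available, key=lambda r: r["priority"])["name"]

routers = [
    {"name": "primary", "priority": 2, "available": True},
    {"name": "failover", "priority": 1, "available": True},
]

normal = select_router(routers)         # both routers present
routers[0]["available"] = False         # primary router withdrawn by the provider
after_failure = select_router(routers)  # only the failover router remains
print(normal, after_failure)
```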

On the topic of service weightings, I was wondering if I could take the same approach as you did with the weightings, since I suspect it would allow me to have a little more control over the order in which regional failovers could be defined...

For example, suppose a weight below zero meant that a server won't receive traffic while there are services with weights greater than zero, and that when no servers have weights greater than zero, the services with the largest negative value receive the traffic. Then I think we could support some interesting failover scenarios. But that's beside the point.
