Traefik unable to connect to Nomad Service Provider

Hello,

I have Traefik 2.8.4 running in a Nomad 1.3.5 cluster. Traefik is running on every client machine along with Nomad.

However, Traefik can't connect to Nomad.

time="2022-09-06T12:36:37Z" level=error msg="Provider connection error failed to load initial nomad services: Get \"http://127.0.0.1:4646/v1/services\": dial tcp 127.0.0.1:4646: connect: connection refused, retrying in 6.225073274s" providerName=nomad

The configuration is:
"--api.dashboard=true",
"--api.insecure=true", ### For Test only, please do not use that in production
"--entrypoints.web.address=:${NOMAD_PORT_http}",
"--entrypoints.traefik.address=:${NOMAD_PORT_admin}",
"--entryPoints.http.transport.lifeCycle.requestAcceptGraceTimeout=15s",
"--entryPoints.http.transport.lifeCycle.graceTimeOut=10s",
"--entryPoints.https.address=:443",
"--providers.nomad=true",
"--entryPoints.https.transport.lifeCycle.requestAcceptGraceTimeout=15s",
"--entryPoints.https.transport.lifeCycle.graceTimeOut=10s",
"--providers.nomad.endpoint.address=http://127.0.0.1:4646", ### IP to your nomad server

This is a very basic barebones configuration right now, as a proof of concept, but any help would be greatly appreciated.

I initially created a GitHub issue for this, but was advised to open a forum post instead:
Nomad Provider Not Working · Issue #9302 · traefik/traefik (github.com)

It's quite possible this is similar to a previous issue with K8s, though I haven't found a solution yet.

Traefik on k8s not listening externally without changing deployment · Issue #3951 · traefik/traefik (github.com)

@Tiffany as requested in the webinar

Also some further info:
systemd-r 1027 systemd-resolve 13u IPv4 19512 0t0 TCP 127.0.0.53:53 (LISTEN)
sshd 1969 root 3u IPv4 27960 0t0 TCP *:22 (LISTEN)
sshd 1969 root 4u IPv6 27962 0t0 TCP *:22 (LISTEN)
nomad 3519 nomad 3u IPv6 35262 0t0 TCP *:4647 (LISTEN)
nomad 3519 nomad 9u IPv6 32750 0t0 TCP *:4648 (LISTEN)
nomad 3519 nomad 19u IPv6 34247 0t0 TCP *:4646 (LISTEN)
docker-pr 19756 root 4u IPv4 557312 0t0 TCP 10.0.0.4:8080 (LISTEN)
docker-pr 19785 root 4u IPv4 554458 0t0 TCP 10.0.0.4:80 (LISTEN)

We have the network mode set to host, with no luck.

Hey @Edstub207,

Is Nomad advertising on that network interface? It looks like, because Traefik is running in a container, it doesn't have access to the host's actual underlying IP.

So I'm wondering whether a) Traefik can't actually reach those addresses, or b) Nomad isn't listening on the localhost IP.
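Both possibilities can be checked with a plain TCP connection attempt, run from the same network namespace Traefik uses. The following is a minimal sketch, not part of Traefik or Nomad: `can_connect` is a hypothetical helper name, and it assumes Python is available wherever you run it (the Traefik image itself does not ship it).

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", unreachable hosts, and timeouts.
        return False
```

Calling `can_connect("127.0.0.1", 4646)` and then the same with the host's own IP should show which address, if any, the Nomad HTTP API answers on from that vantage point.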

Hello @SantoDE

Yeah, that's what I've been trying to work out as well, but everything looks fine on the Nomad side: it is listening on a wildcard address, and I started the Docker container with host networking. So they should be able to talk to each other.

Have you seen anything similar to this during testing/development and what was done to fix it?

Hello @SantoDE, I managed to get Traefik connected today to Nomad.

Somehow, reducing Nomad to a single client node resolved the problem. I'm going to look into scaling back up to four clients this week.

I wonder if somehow it was getting confused when trying to run multiple Traefik and Nomad instances all connecting to each other (I had traefik running as a service, so it was on each client node).

Hi,
I have tried following the tutorial (Traefik Proxy Integrates with Hashicorp Nomad | Traefik Labs) with the following "modifications" to get it to work in a minimal dev setup:

  • used nomad v1.3.5
  • start nomad agent with: sudo nomad agent -dev -bind 0.0.0.0 -log-level INFO
  • set Traefik image to: traefik:2.8 (also tried traefik:2.9)
  • tried various versions of the line "--providers.nomad.endpoint.address=...", including commenting it out and setting it to the default

When running through the tutorial, Traefik does not manage to connect to Nomad service discovery; nomad alloc logs f671f.... gives:

time="2022-09-22T15:46:27Z" level=error msg="Provider connection error failed to load initial nomad services: Get \"http://127.0.0.1:4646/v1/services\": dial tcp 127.0.0.1:4646: connect: connection refused, retrying in 15.401737866s" providerName=nomad

So Traefik never manages to detect the whoami-demo service. Any ideas what the issue could be, or what I should set providers.nomad.endpoint.address to for local development?

(I asked the same question in Traefik Proxy Integrates with Hashicorp Nomad | Traefik Labs - #6 by joh4n)

Hello, I have a quite similar issue.

Here is the output with debug enabled

time="2022-09-28T00:26:46Z" level=error msg="Failed to build HTTP service configuration: address is missing" providerName=nomad
time="2022-09-28T00:26:46Z" level=debug msg="Configuration received: {\"http\":{\"routers\":{\"traefik-http\":{\"service\":\"traefik-http\",\"rule\":\"Host(`traefik-http`)\"}},\"services\":{\"traefik-http\":{\"loadBalancer\":{\"servers\":[{\"url\":\"http://127.0.0.1:80\"}],\"passHostHeader\":true}}}},\"tcp\":{},\"udp\":{}}" providerName=nomad
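The "Configuration received" debug line carries the whole dynamic configuration as escaped JSON, which is hard to read by eye. Here is a minimal sketch for unpacking it. The function names are hypothetical, and it assumes the quoted `msg` value is also a valid JSON string, which holds for the line above but may not hold for every log line.

```python
import json
import re

# Matches msg="..." including backslash-escaped characters inside the quotes.
MSG_RE = re.compile(r'msg=("(?:[^"\\]|\\.)*")')
PREFIX = "Configuration received: "


def parse_config_received(line):
    """Return the configuration dict from a 'Configuration received' line, else None."""
    match = MSG_RE.search(line)
    if match is None:
        return None
    message = json.loads(match.group(1))  # undo the \" escaping
    if not message.startswith(PREFIX):
        return None
    return json.loads(message[len(PREFIX):])


def server_urls(config):
    """Map each HTTP service name to the list of backend URLs Traefik built for it."""
    urls = {}
    for name, service in config.get("http", {}).get("services", {}).items():
        servers = service.get("loadBalancer", {}).get("servers", [])
        urls[name] = [server.get("url") for server in servers]
    return urls
```

Passing the debug line above through `parse_config_received` and then `server_urls` shows at a glance which address and port each discovered service resolved to.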

Hey everybody, same problem here.
I am on Vagrant, building a server and a client.

time="2022-10-02T07:37:19Z" level=debug msg="Skipping unchanged configuration." providerName=nomad
time="2022-10-02T07:37:49Z" level=error msg="Failed to build HTTP service configuration: address is missing" providerName=nomad

This is my Traefik configuration. The dashboard is working correctly, and I have a wildcard certificate.

job "traefik" {
  datacenters = ["dc1"]
  type        = "service"

  affinity {
    attribute = "${node1}"
    value     = "dc1"
    weight    = 100
  }

  group "traefik" {
    count = 1


    network {
      port "http" {
        static = 80
      }
      port "https" {
        static = 443
      }
      mode = "host"
    }

    service {
      name = "traefik"
      provider = "nomad"
      port = "http"
      tags = [
        "traefik.enable=true",
        "traefik.http.middlewares.traefik-auth.basicauth.removeheader=true",
        "traefik.http.middlewares.traefik-auth.basicauth.users=proxyadmin:$apr1$D3l5n6X2$DzDfIQiqSFCDF4J4qZyqK.",
        "traefik.http.routers.traefik.rule=Host(`traefik.domain.com`)",
        "traefik.http.routers.traefik.service=api@internal",
        "traefik.http.routers.traefik.tls=true",
        "traefik.http.routers.traefik.middlewares=secHeaders@file"
      ]
    }

    task "traefik" {
      driver = "docker"

      config {
        image = "traefik:2.9"
        ports = ["http", "https"]
        volumes = [
          # Use absolute paths to mount arbitrary paths on the host
          "/container/traefik/config/traefik.yml:/etc/traefik/traefik.yml:ro",  # static traefik configuration
          "/container/traefik/config/dynamic.yml:/etc/traefik/dynamic.yml:ro",  # dynamic traefik configuration
          "/container/traefik/certs:/etc/traefik/certs/"
        ]
      }
    }
  }
}

This is my traefik.yml

# Logging levels are DEBUG, PANIC, FATAL, ERROR, WARN, and INFO.
log:
  level: DEBUG 

# Provider configuration
providers:
  nomad:
    endpoint:
      address: http://172.20.0.70:4646
    namespace: "default"
    refreshInterval: 30s
  file:
    filename: /etc/traefik/dynamic.yml
    watch: true


# if you don't need the dashboard disable it
api:
  dashboard: true 

# Entrypoints configuration
entryPoints:
  web:
    address: ':80' # http
    http:
      redirections:
        entryPoint:
          to: web-secure
          scheme: https
  web-secure:
    address: ':443' # https

If I spin up a new job in Nomad, Traefik does not pick it up.

I can provide more info. In the meantime, I am still trying to figure out exactly what this means:

time="2022-10-02T07:37:19Z" level=debug msg="Skipping unchanged configuration." providerName=nomad
time="2022-10-02T07:37:49Z" level=error msg="Failed to build HTTP service configuration: address is missing" providerName=nomad

OK, I got my second service working. In the Nomad job configuration you need to specify the port used by the service.

Here is the Nomad job for my second service, YouTrack.
I'll post the entire job; it might be useful to someone.

job "youtrack" {
  datacenters = ["dc1"]
  type        = "service"


  group "youtrack" {
    count = 1

    network {
      port "8080" {
        static = 8080
      }
    }

    service {
      name = "youtrack"
      provider = "nomad"
      port = "8080"
      tags = [
        "traefik.enable=true",
        "traefik.http.routers.youtrack.rule=Host(`youtrack.domain.com`)",
        "traefik.http.routers.youtrack.tls=true",
        "traefik.http.routers.youtrack.middlewares=customHeaders@file",
        "traefik.http.services.youtrack.loadbalancer.server.port=8080"
      ]
    }

    task "youtrack" {
      driver = "docker"

      resources {
        cpu    = 500
        memory = 1000
      }

      config {
        image = "jetbrains/youtrack:2022.2.51283"
        ports = ["8080"]
        volumes = [
          # Use absolute paths to mount arbitrary paths on the host
          "/container/youtrack/data:/opt/youtrack/data",
          "/container/youtrack/conf:/opt/youtrack/conf",
          "/container/youtrack/logs:/opt/youtrack/logs",
          "/container/youtrack/backups:/opt/youtrack/backups"
        ]
      }
    }
  }
}
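To see whether a service registered through a job like this actually carries an address and a port, the registrations can be inspected directly. This is a rough sketch under assumptions: it assumes the Nomad HTTP API returns a JSON list of registrations at /v1/service/<name> with ServiceName, Address, and Port fields (please check the API docs for your Nomad version), that ACLs are not enforced, and both function names are hypothetical.

```python
import json
import urllib.request


def fetch_registrations(endpoint, service_name, timeout=5.0):
    """GET <endpoint>/v1/service/<service_name> and return the decoded JSON list."""
    url = f"{endpoint.rstrip('/')}/v1/service/{service_name}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def incomplete_registrations(registrations):
    """Return (service name, reason) pairs for entries lacking an address or port."""
    problems = []
    for registration in registrations:
        name = registration.get("ServiceName", "<unknown>")
        # An empty string, a zero, or a missing key all count as "not set".
        if not registration.get("Address"):
            problems.append((name, "address is missing"))
        if not registration.get("Port"):
            problems.append((name, "port is missing"))
    return problems
```

For example, `incomplete_registrations(fetch_registrations("http://172.20.0.70:4646", "youtrack"))` should return an empty list once the port is declared in the job.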

Anyway, I am still getting these:

time="2022-10-02T08:33:42Z" level=debug msg="Skipping unchanged configuration." providerName=nomad
time="2022-10-02T08:34:12Z" level=debug msg="Configuration received: {\"http\":{\"routers\":{\"traefik\":{\"middlewares\":[\"secHeaders@file\"],\"service\":\"api@internal\",\"rule\":\"Host(`traefik.montecatone.com`)\",\"tls\":{}}},\"services\":{\"traefik\":{\"loadBalancer\":{\"servers\":[{\"url\":\"http://172.20.0.80:80\"}],\"passHostHeader\":true}}},\"middlewares\":{\"traefik

Are these by design?