Looking for UDP config validation

Hello all:

I am trying to run netboot.xyz behind Traefik CE v3.1.2, but the TFTP protocol is not working.
I believe the Traefik configuration is correct, but I am looking for validation.

My environment uses Podman 5.1.2, Nomad 1.8.3, and Consul 1.19.1.

Here is the Nomad definition for Traefik:

job "traefik" {
  datacenters = ["homelab"]
  type        = "service"

  group "traefik" {
    network {
      port "http" {
        static = 80
      }
      port "https" {
        static = 443
      }
      port "tftp" {
        static = 69
      }
    }

    service {
      name = "traefik"
      port = "https"
      tags = [
        "traefik.enable=true",
        "traefik.http.routers.dashboard.rule=Host(`traefik.example.com`)",
        "traefik.http.routers.dashboard.service=api@internal",
        "traefik.http.routers.dashboard.entrypoints=web,websecure",
        "traefik.http.routers.dashboard.tls.certresolver=internal",
        "traefik.http.routers.dashboard.tls=true",
      ]

      check {
        name     = "alive"
        type     = "tcp"
        port     = "http"
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "traefik" {
      driver = "podman"
      config {
        image = "docker.io/library/traefik:v3.1.2"
        ports = [
          "http",
          "https",
          "tftp",
        ]

        args = [
          "--api.dashboard=true",
          "--log.level=DEBUG",
          "--accesslog=true",

          # Consul integration
          "--providers.consulcatalog=true",
          "--providers.consulcatalog.exposedByDefault=false",
          "--providers.consulcatalog.prefix=traefik",
          "--providers.consulcatalog.endpoint.address=${NOMAD_IP_http}:8500",

          # Internal ACME/PKI
          "--certificatesresolvers.internal.acme.caserver=https://ca.example.com/acme/acme/directory",
          "--certificatesresolvers.internal.acme.email=${NOMAD_SHORT_ALLOC_ID}@example.com",
          "--certificatesresolvers.internal.acme.storage=/local/internal.acme.json",
          "--certificatesresolvers.internal.acme.dnsChallenge=true",
          "--certificatesresolvers.internal.acme.dnschallenge.provider=rfc2136",
          "--certificatesresolvers.internal.acme.dnschallenge.resolvers=ww.xx.yy.zz:53",
          "--certificatesresolvers.internal.acme.dnschallenge.delayBeforeCheck=60",
          "--certificatesresolvers.internal.acme.certificatesduration=720",
          
          # HTTP entrypoints
          "--entrypoints.web.address=:${NOMAD_PORT_http}",
          "--entrypoints.websecure.address=:${NOMAD_PORT_https}",

          # Non-HTTP entrypoints
          "--entrypoints.netbootxyz.address=:${NOMAD_PORT_tftp}/udp",
        ]
      }
      
      artifact {
        source = "https://ca.example.com/roots.pem"
        mode   = "file"
      }

      env {
        LEGO_CA_CERTIFICATES   = "/local/roots.pem"
        RFC2136_TSIG_KEY       = "traefik-dns-challenge"
        RFC2136_TSIG_SECRET    = "<REDACTED>"
        RFC2136_TSIG_ALGORITHM = "hmac-sha256"
        RFC2136_NAMESERVER     = "ww.xx.yy.zz"
      }

      resources {
        cpu    = 100
        memory = 128
      }
    }
  }
}

Here is the Nomad definition for netboot.xyz:

job "netbootxyz" {
  datacenters = ["homelab"]
  type        = "service"

  group "netbootxyz" {
    network {
      port "ui" {
        to = 3000
      }
      port "tftp" {
        to = 69
      }
    }
    
    service {
      name = "netbootxyz-ui"
      port = "ui"
      
      tags = [
        "traefik.enable=true",
        "traefik.http.routers.netbootxyz-ui.rule=Host(`netbootxyz.example.com`)",
        "traefik.http.routers.netbootxyz-ui.service=netbootxyz-ui",
      ]
    }
    
    service {
      name = "netbootxyz-tftp"
      port = "tftp"
      
      tags = [
        "traefik.enable=true",
        "traefik.udp.routers.netbootxyz-tftp.entryPoints=netbootxyz",
        "traefik.udp.routers.netbootxyz-tftp.service=netbootxyz-tftp",
      ]
    }

    reschedule {
      attempts       = 3
      delay          = "20s"
      delay_function = "exponential"
      interval       = "3m"
      unlimited      = false
    }

    task "netbootxyz" {
      driver = "podman"

      config {
        image      = "netbootxyz/netbootxyz:0.7.3-nbxyz1"
        ports      = ["ui", "tftp"]
      }

      resources {
        cpu        = 200
        memory     = 600
      }
    }
  }
}

Here is the port mapping on the netboot.xyz container:

podman container inspect netbootxyz-f0446b3f-4816-50f8-eb29-c0f52d0d387b | jq '.[].NetworkSettings.Ports'
{
  "3000/tcp": [
    {
      "HostIp": "10.1.18.13",
      "HostPort": "25857"
    }
  ],
  "3000/udp": [
    {
      "HostIp": "10.1.18.13",
      "HostPort": "25857"
    }
  ],
  "69/tcp": [
    {
      "HostIp": "10.1.18.13",
      "HostPort": "28631"
    }
  ],
  "69/udp": [
    {
      "HostIp": "10.1.18.13",
      "HostPort": "28631"
    }
  ],
  "80/tcp": null
}

In Traefik's log, I see the client is attempting to connect to the TFTP server:

2024-08-28T11:41:32.368542293-04:00 stdout F 2024-08-28T15:41:32Z DBG github.com/traefik/traefik/v3/pkg/udp/proxy.go:23 > Handling UDP stream from 172.16.100.116:7371 to 10.1.18.13:28631
2024-08-28T11:41:40.318793590-04:00 stdout F 2024-08-28T15:41:40Z DBG github.com/traefik/traefik/v3/pkg/udp/proxy.go:23 > Handling UDP stream from 172.16.100.116:7371 to 10.1.18.13:28631

The client attempts to connect and fails with a timeout error while trying to download the PXE boot file.
If I take Traefik out of the path, everything works fine.

I would expect the end-to-end port routing to work like this:
client (7371/udp) --> Traefik (69/udp) --> container (28631/udp:69/udp)
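
One thing that might be worth ruling out: per RFC 1350, a TFTP server does not answer a read request from port 69. It picks a new port (its transfer ID) for the reply and for the rest of the transfer, so the reply may not come back through the 28631/udp mapping that Traefik forwarded to. Whether that is what breaks here is an assumption, not something the logs above confirm. Below is a minimal Python sketch of a probe that sends one read request and prints the address and port the reply arrives from. The function names are made up for illustration, and the host and file name in the usage comments are placeholders.

```python
import socket
import struct

# TFTP opcodes from RFC 1350
OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a read request: opcode 1, then filename and mode, each NUL-terminated."""
    return (
        struct.pack("!H", OP_RRQ)
        + filename.encode("ascii") + b"\0"
        + mode.encode("ascii") + b"\0"
    )


def describe_reply(packet: bytes) -> str:
    """Summarize the first packet sent back by the server."""
    if len(packet) < 4:
        return "short packet"
    opcode, value = struct.unpack("!HH", packet[:4])
    if opcode == OP_DATA:
        return f"DATA block {value}, {len(packet) - 4} bytes"
    if opcode == OP_ERROR:
        message = packet[4:].split(b"\0", 1)[0].decode("ascii", "replace")
        return f"ERROR {value}: {message}"
    return f"opcode {opcode}"


def probe(host: str, port: int, filename: str, timeout: float = 5.0) -> None:
    """Send one read request and report where the reply came from, if any."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (host, port))
        try:
            packet, (src_ip, src_port) = sock.recvfrom(65535)
        except socket.timeout:
            print(f"no reply from {host}:{port} within {timeout}s")
            return
        print(f"reply from {src_ip}:{src_port}: {describe_reply(packet)}")


# Hypothetical usage; replace the placeholders with real values:
# probe("10.1.18.13", 28631, "<boot-file>")    # straight to the container mapping
# probe("<traefik-host>", 69, "<boot-file>")   # through the Traefik UDP entrypoint
```

If the direct probe prints a reply from a port other than 28631 while the probe through Traefik times out, that would point at the transfer-ID port change rather than at the entrypoint or router tags.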

HTTP is working just fine.

From a Traefik perspective, does the configuration above look correct?

Thanks

I would say the usage of your components is very low in this forum. My hypothesis:

Podman x Nomad x Consul = 0.01 x 0.01 x 0.01 = 0.000001

That is the probability that someone else here is using the same combination :sweat_smile:

Maybe try reddit.com/r/Traefik/