Traefik Ingress times out when connecting from another host, works fine from localhost

I'm trying to get a Kubernetes cluster to run on a single Red Hat 7.7 server.

I've previously managed to get the same setup working on CentOS 7 and on a Red Hat 7.7 AMI on AWS.

The Traefik HTTP Ingress Controller appears to be up and running, but all HTTP requests to the nodePort of traefik-ingress-controller-http-service time out.

Output of kubectl get services | grep traefik

At first I assumed that there was something wrong with the Ingress itself, but if you try to curl from inside the server it works fine.

To rule out a firewall issue I added a nodePort to some of my other services, and they can be accessed just fine.

A debug message appears on the log from the traefik-ingress-controller pod whenever I use curl inside the server:

level=debug msg="vulcand/oxy/roundrobin/rr: begin ServeHttp on request"

There are no debug messages for requests that time out.

Using netstat -anp I noticed that kube-proxy owns the port I'm trying to use, so I also looked at the kube-proxy pod's log and compared it with the log from my successful installation. The only difference is this line, which only shows up on the failing server:

node.go:135] Successfully retrieved node IP: 192.168.215.172

As a temporary workaround I've set up a port forward, and that works fine:

nohup kubectl port-forward --address 0.0.0.0 svc/traefik-ingress-controller-http-service 30225:443 -n traefik &

My versions are:

Kubernetes: 1.17.3
Traefik: 1.7

Traefik config:

apiVersion: v1
kind: ConfigMap
metadata:
  name: traefik-ingress-configmap
  namespace: traefik
data:
  traefik.toml: |
    defaultEntryPoints = ["https","http"]
    [entryPoints]
      [entryPoints.http]
      address = ":80"
      [entryPoints.https]
      address = ":443"
        [entryPoints.https.tls]
          [[entryPoints.https.tls.certificates]]
          CertFile = "/ssl/tls.crt"
          KeyFile = "/ssl/tls.key"
    [kubernetes]
      [kubernetes.ingressEndpoint]
        publishedService = "traefik/traefik-ingress-controller-http-service"
    [ping]
    entryPoint = "http"

Service:

---
kind: Service
apiVersion: v1
metadata:
  name: traefik-ingress-controller-http-service
  namespace: traefik
  annotations: {}
spec:
  selector:
    k8s-app: traefik-ingress-controller
  ports:
  - protocol: TCP
    port: 80
    name: http
  - protocol: TCP
    port: 443
    name: https
    nodePort: 30220
  type: NodePort

Traefik Deployment:

---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: traefik-ingress-controller
  namespace: traefik
  labels:
    k8s-app: traefik-ingress-controller
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: traefik-ingress-controller
  template:
    metadata:
      labels:
        k8s-app: traefik-ingress-controller
        name: traefik-ingress-controller
    spec:
      serviceAccountName: traefik-ingress-serviceaccount
      terminationGracePeriodSeconds: 35
      volumes:
        - name: traefik-ui-tls-cert
          secret:
            secretName: traefik-ui-tls-cert
        - name: traefik-ingress-configmap
          configMap:
            name: traefik-ingress-configmap
      containers:
      - image: traefik:v1.7
        name: traefik-ingress-controller
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 200m
            memory: 384Mi
          requests:
            cpu: 25m
            memory: 128Mi
        livenessProbe:
          failureThreshold: 2
          httpGet:
            path: /ping
            port: 80
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 5
        readinessProbe:
          failureThreshold: 2
          httpGet:
            path: /ping
            port: 80
            scheme: HTTP
          periodSeconds: 5
        volumeMounts:
          - mountPath: "/ssl"
            name: "traefik-ui-tls-cert"
          - mountPath: "/config"
            name: "traefik-ingress-configmap"
        ports:
        - name: http
          containerPort: 80
        - name: https
          containerPort: 443
        - name: dashboard
          containerPort: 8080
        args:
        - --logLevel=DEBUG
        - --configfile=/config/traefik.toml
        - --insecureskipverify

Any ideas are welcome.

I started tracing the TCP packets across all relevant network interfaces and realized that the cluster IP of the Traefik service, which handles service resolution and load balancing through iptables DNAT rules, could not answer the initial SYN packet.
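For reference, a trace like the one described above can be captured with tcpdump on the node. This is only a sketch: the helper name `trace_nodeport` is hypothetical, 30220 and 443 are the https nodePort and port from the Service above, and tcpdump has to be installed and run as root.

```shell
# Hypothetical helper: watch a NodePort and its target port on all interfaces.
#   $1 = nodePort as seen by the client (before DNAT)
#   $2 = port the pod listens on (after DNAT)
# -i any : capture on every interface (physical NIC, bridge, veth pairs)
# -nn    : do not resolve host names or port names
trace_nodeport() {
  tcpdump -i any -nn "tcp port $1 or tcp port $2"
}

# Example (run as root on the node, then curl from the other host):
#   trace_nodeport 30220 443
```

A SYN that shows up on the external interface but never gets a SYN-ACK back to the client matches the symptom described here.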

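In this case I had to set externalTrafficPolicy to Local so that the Traefik HTTP Ingress Controller pod sees the actual client IP and answers to it, instead of the masqueraded (NAT) IP and port. With Local, kube-proxy only forwards nodePort traffic to pods running on the node that received it, which is fine on a single-node cluster.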

---
kind: Service
apiVersion: v1
metadata:
  name: traefik-ingress-controller-http-service
  namespace: traefik
  annotations: {}
spec:
  selector:
    k8s-app: traefik-ingress-controller
  ports:
  - protocol: TCP
    port: 80
    name: http
  - protocol: TCP
    port: 443
    name: https
    nodePort: 30220
  type: NodePort
  externalTrafficPolicy: Local
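
If re-applying the whole manifest is inconvenient, the same change can probably be made in place with `kubectl patch`. A minimal sketch, assuming the service and namespace names used above; the function names `set_local_policy` and `show_policy` are made up for illustration.

```shell
# Hypothetical helpers; $1 = service name, $2 = namespace.

# Switch the service to externalTrafficPolicy: Local using a merge patch.
set_local_policy() {
  kubectl patch service "$1" -n "$2" --type merge \
    -p '{"spec":{"externalTrafficPolicy":"Local"}}'
}

# Print the policy currently set on the service.
show_policy() {
  kubectl get service "$1" -n "$2" \
    -o jsonpath='{.spec.externalTrafficPolicy}'
}

# Example:
#   set_local_policy traefik-ingress-controller-http-service traefik
#   show_policy traefik-ingress-controller-http-service traefik
```

After patching, a curl from another host against the nodePort is the quickest way to confirm that the timeouts are gone.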