How to use Ingress with Traefik 2.0

@dduportal yeah, makes sense.

Will switch to nginx

Thanks for trying to reproduce my error.

Traefik v2 is deployed as a daemonset. As this is my cluster for testing it is not bad if things are unavailable for some time. So Traefik v1 is replaced by v2 while I am testing. Traefik vX is the only ingress-controller in use.

Config of Traefik:

(Important part of the daemonset-definition):
spec:
  selector:
    matchLabels:
      k8s-app: traefik-ingress-lb
  template:
    metadata:
      creationTimestamp: null
      labels:
        k8s-app: traefik-ingress-lb
        name: traefik-ingress-lb
    spec:
      containers:
      - args:
        - --configfile=/config/traefik.yaml
        image: traefik:v2.0.5
        livenessProbe:
          failureThreshold: 2
          httpGet:
            path: /ping
            port: 9090
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 1
        name: traefik-ingress-lb
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        - containerPort: 443
          name: https
          protocol: TCP
        - containerPort: 7070
          name: metrics
          protocol: TCP
        - containerPort: 9090
          name: ping
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /ping
            port: 9090
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 300m
            memory: 150Mi
          requests:
            cpu: 100m
            memory: 50Mi
        securityContext:
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - ALL
        volumeMounts:
        - mountPath: /config
          name: config
        - mountPath: /ssl
          name: ssl
      volumes:
      - configMap:
          defaultMode: 420
          name: traefik-conf
        name: config
      - name: ssl
        secret:
          defaultMode: 420
          secretName: traefik-web-ui
---
apiVersion: v1
data:
  traefik.yaml: |
    global:
      checkNewVersion: false
      sendAnonymousUsage: false
    log:
      level: DEBUG
    providers:
      kubernetesCRD: {}
      kubernetesIngress: {}
    entryPoints:
      ping:
        address: ":9090"
      web:
        address: ":80"
      websecure:
        address: ":443"
      metrics:
        address: ":7070"
    ping:
      entryPoint: "ping"
    metrics:
      prometheus:
        entryPoint: metrics
    api:
      dashboard: true
    tls: # this setting is ignored according to the logs
      stores:
        default:
          defaultCertificate:
            certFile: /ssl/tls.crt
            keyFile: /ssl/tls.key
    http:
      routers:
        api:
          rule: Host(`traefik.mydomain.com`)
          entrypoints:
            - websecure
          service: api@internal
          middlewares:
            - auth
          tls: {}
      middlewares:
        auth:
          basicAuth:
            users:
              - 'test:somepassword'

The dashboard is not working, but I guess I could fix that by not defining it in the config. But that is a very secondary problem and I can fix that later.

Also, you can find the logs here:
https://pastebin.com/vmEy3fAb

Mmmh I see a few things to fix or think about here:

  • There is something weird: the logs do not correspond to the configuration provided (there is no mention of the kubernetesCRD provider although it is in the configuration). As it is a DaemonSet, there might be a propagation issue in the cluster. Because of this, we cannot be sure that the same Traefik DaemonSet instance is answering the whoami request, handling the events from Kubernetes, etc. Also, there is no mention of whoami in the logs: so if the Traefik instance you retrieved the logs from handled the requests, a 404 is expected, as no event about whoami was sent from Kubernetes to this Traefik.

=> To solve this, would you mind switching Traefik to a single replica (either deployment, or keep the daemonset and restrict it to a single node), so we are sure that a single pod is handling events and routing? Of course it is a temporary measure and you'll go back to your initial setup after we solved your issue.
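One way to keep the DaemonSet but pin it to a single node is a nodeSelector. This is only a minimal sketch: the label traefik-debug=true is hypothetical (it does not come from this thread) and you would have to put it on exactly one node yourself.

```yaml
# Fragment of the DaemonSet pod template (spec.template.spec).
# Assumption: exactly one node was labelled beforehand, e.g. with
#   kubectl label node <node-name> traefik-debug=true
# The label key/value is hypothetical; any label unique to one node works.
spec:
  template:
    spec:
      nodeSelector:
        traefik-debug: "true"
```

Removing the nodeSelector again restores the original one-pod-per-node behaviour.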

=> To solve the ignored tls: section (and the dashboard router not being picked up): tls: and http: are dynamic configuration, so Traefik only reads them through a provider, not from the static configuration file. You have to enable the file provider (ref. https://docs.traefik.io/v2.0/providers/file/). The de-facto pattern is to define a file dynamic.yml (or toml) which contains the TLS, router and service objects and to point the file provider to it, or to migrate the objects to another provider (it could be CRDs associated with your Traefik deployment, but let's stay with your current file to minimize the changes for now).
Let's make it simple and keep everything in the file traefik.yaml for now: change the providers: section in the configmap to the following:

  providers:
    kubernetesCRD: {}
    kubernetesIngress: {}
    file:
      filename: "/config/traefik.yaml"

Don't forget to kill your pods to be sure that the configmap changes are taken into account (as we changed the static configuration AND Kubernetes does not guarantee synchronous propagation of configmaps).
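For reference, the separate-file pattern mentioned above (as opposed to keeping everything in traefik.yaml) could look roughly like this. It is only a sketch: the path /config/dynamic.yml is an assumption (it would have to be added as a second key of the traefik-conf configmap), and the contents are simply the tls: and http: sections moved out of traefik.yaml.

```yaml
# traefik.yaml (static configuration): point the file provider at a
# separate file instead of at traefik.yaml itself.
providers:
  kubernetesCRD: {}
  kubernetesIngress: {}
  file:
    filename: "/config/dynamic.yml"   # assumed path, not from the thread
---
# dynamic.yml (dynamic configuration), same content as in the configmap above
tls:
  stores:
    default:
      defaultCertificate:
        certFile: /ssl/tls.crt
        keyFile: /ssl/tls.key
http:
  routers:
    api:
      rule: Host(`traefik.mydomain.com`)
      entryPoints:
        - websecure
      service: api@internal
      middlewares:
        - auth
      tls: {}
  middlewares:
    auth:
      basicAuth:
        users:
          # basicAuth expects htpasswd-style "user:hashed-password" entries
          - 'test:somepassword'
```

Splitting the files this way keeps the static part (entrypoints, providers) apart from the part Traefik can reload at runtime.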

  • Could you also enable the access logs, so we can see from Traefik's logs whether the requests are correctly handled? It's easy: add the instruction accessLog: {} to the traefik.yaml file (and kill the pods...).
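As a sketch of where that instruction goes, assuming the traefik.yaml from the configmap above: accessLog is a top-level key of the static configuration, next to log:.

```yaml
# traefik.yaml (static configuration), excerpt
log:
  level: DEBUG
accessLog: {}   # empty object enables access logs with default settings
```

With the default settings the access log lines go to the container's standard output, so they should show up in the pod logs alongside the debug output.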

After these changes, could you check:

  • Is the dashboard available now?
  • Is whoami available now?

If not, I'll need the logs of the single Traefik pod AND the output of kubectl get svc,pods,ingress in the namespace where you deployed whoami.

Thanks a lot for the insights and tips.
We are making progress here I think :slight_smile:

The certificate gets accepted now and works. Also, the web interface works.

whoami and every other ingress is still not working.
I also deployed Traefik as a Deployment with only one replica this time.

Logs: https://pastebin.com/sU80Q92T

kubectl output:

NAME                                                        READY   STATUS      RESTARTS   AGE
pod/whoami-57bc4b5cfc-9zrh9                                 1/1     Running     0          6m46s
pod/whoami-57bc4b5cfc-fkm4j                                 1/1     Running     0          6m46s
pod/whoami-57bc4b5cfc-mrn97                                 1/1     Running     0          6m46s

NAME                                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
service/whoami                        ClusterIP   10.233.31.222   <none>        8000/TCP                 6m46s

NAME                                       HOSTS                                                     ADDRESS   PORTS     AGE
ingress.extensions/test-ingress            whoami.mydomain.com                                                 80        6m46s

OK, so now Traefik picked up the ingress and added the routing configuration: I got the same kind of logs.

What are the results of the following commands, run from outside the Kubernetes cluster:

curl -v http://traefik.mydomain.com

and

curl -v http://traefik.mydomain.com

?

[edit]

sorry I meant curl -v http://whoami.mydomain.com for the 2nd one

Those commands are the same.

Anyways, http makes no sense for me, since HAProxy redirects to https:

curl -v https://traefik.mydomain.com
*   Trying internal-ip:443...
* Connected to traefik.mydomain.com (internal-ip) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: OU=Domain Control Validated; CN=*.mydomain.com
*  start date: Apr  9 14:40:20 2019 GMT
*  expire date: Apr  9 14:40:20 2021 GMT
*  subjectAltName: host "traefik.mydomain.com" matched cert's "*.mydomain.com"
*  issuer: some ca
*  SSL certificate verify ok.
> GET / HTTP/1.1
> Host: traefik.mydomain.com
> User-Agent: curl/7.66.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 401 Unauthorized
< Content-Type: text/plain
< Www-Authenticate: Basic realm="traefik"
< Date: Fri, 22 Nov 2019 14:31:49 GMT
< Content-Length: 17
< 
401 Unauthorized

Sorry I meant curl -v http://whoami.mydomain.com for the 2nd one

No problem.

curl -v https://whoami.mydomain.com
*   Trying internal-ip:443...
* Connected to whoami.mydomain.com (internal-ip) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: OU=Domain Control Validated; CN=*.mydomain.com
*  start date: Apr  9 14:40:20 2019 GMT
*  expire date: Apr  9 14:40:20 2021 GMT
*  subjectAltName: host "whoami.mydomain.com" matched cert's "*.mydomain.com"
*  issuer: some issuer
*  SSL certificate verify ok.
> GET / HTTP/1.1
> Host: whoami.mydomain.com
> User-Agent: curl/7.66.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
< Content-Type: text/plain; charset=utf-8
< X-Content-Type-Options: nosniff
< Date: Fri, 22 Nov 2019 14:35:51 GMT
< Content-Length: 19
< 
404 page not found

So you have HTTPS. Is HAProxy terminating TLS, or do you expect Traefik to do it?

HAProxy terminates https, but also uses https for the backends.

OK, this is why there is an HTTP 404: by terminating TLS, HAProxy has to create a new HTTP client <-> server connection to Traefik.
So it rewrites the Host header, setting it to the IP (or domain) it uses to reach Traefik (I suppose it is the IP of one of the nodeports).
When the request hits Traefik, it does not match the domain name, so Traefik answers with an HTTP 404.

Was the setup exactly the same with Traefik v1? I doubt this could have worked as it is before.

=> Can you validate my assumption by removing the host: whoami.mydomain.com from the Ingress and trying again?
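A host-less Ingress for this test could look roughly like the following. This is only a sketch: the original Ingress manifest was not posted in this thread, so the name, service and port are taken from the kubectl output above, and the "/" path is an assumption.

```yaml
# Sketch only: reconstructed from the kubectl output, not the real manifest.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test-ingress
spec:
  rules:
    - http:            # no "host:" key, so the rule matches any Host header
        paths:
          - path: /    # assumed path
            backend:
              serviceName: whoami
              servicePort: 8000
```

If the Host header really is rewritten on the way to Traefik, this rule should still match and whoami should answer.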

I have doubts about your theory.
This setup works perfectly fine in production with Traefik 1.7.19.
My theory is that this works with v1 because all the traffic for applications always comes in on port 443 of Traefik, from where it gets forwarded to the right pods. That is because of the default-entrypoints setting in Traefik v1.
Traefik v2 just uses all entrypoints.
While testing around and going through the web UI, I saw that some routers have a green sign next to them now. So I visited the corresponding URLs. And voila, some ingresses work now, some do not.
Why they work puzzles me. But I noticed that the services of the working ingresses are listening on ports 80 and 4443, which leads me to the theory that Traefik somehow can map those correctly to web and websecure? I am just wildly guessing here.

Anyways, I removed the host directive and get the same result:

curl -v https://whoami.mydomain.com
*   Trying internal-ip:443...
* Connected to whoami.mydomain.com (internal-ip) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: OU=Domain Control Validated; CN=*.mydomain.com
*  start date: Apr  9 14:40:20 2019 GMT
*  expire date: Apr  9 14:40:20 2021 GMT
*  subjectAltName: host "whoami.mydomain.com" matched cert's "*.mydomain.com"
*  issuer: some issuer
*  SSL certificate verify ok.
> GET / HTTP/1.1
> Host: whoami.mydomain.com
> User-Agent: curl/7.66.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
< Content-Type: text/plain; charset=utf-8
< X-Content-Type-Options: nosniff
< Date: Fri, 22 Nov 2019 14:56:11 GMT
< Content-Length: 19
< 
404 page not found

I don't have anything else to say, as I can't reproduce this on either my local k3s or a remote EKS Kubernetes cluster, both with HAProxy in front, so there is for sure something that I do not understand :slight_smile:

If you think it's because of the entrypoints, then so be it. Then it means you are better off staying on Traefik v1.7 or changing Ingress Controller, as your Kubernetes cluster does not seem to work with Traefik v2.