Trying out NGINX compatibility

We currently use the soon-to-be-retired NGINX Ingress Controller, so I’m evaluating alternatives, especially anything that claims some sort of built-in compatibility to ease the migration.

We use EKS and currently have the NGINX Ingress Controller installed with its Service annotated so that an AWS NLB sits in front of it. We also use External DNS, which updates a Route 53 zone with records pointing at the NLB FQDN.
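
For context, the controller Service annotations look something like this (simplified sketch; the exact set depends on whether the in-tree provider or the AWS Load Balancer Controller provisions the NLB, and the names here are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller   # illustrative name
  namespace: ingress-nginx
  annotations:
    # ask AWS for an NLB rather than a Classic ELB
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer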

I’ve deployed Traefik and enabled the NGINX provider with the defaults, and I also added similar Service annotations so there is now a new, separate NLB pointing at the Traefik Pods. However, External DNS still points all DNS records at the NGINX NLB.
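
The relevant chunk of my Traefik Helm values is roughly this (a sketch from memory; the Service annotations just mirror whatever is already on the NGINX controller Service):

providers:
  kubernetesIngressNginx:
    enabled: true
service:
  annotations:
    # same annotation style as on the NGINX Service, so a second NLB gets provisioned
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"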

I found the providers.kubernetesIngressNginx.publishStatusAddress option in the Helm chart and set it to the FQDN of the new NLB. Now the Ingress addresses point at the new NLB and External DNS updates all of the records in the Route 53 zone. I notice that now both NGINX and Traefik fight over updating the Ingress addresses, but as long as Traefik wins by the time External DNS runs it shouldn’t be a problem.
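
In values form that’s just the following (the NLB DNS name below is a made-up placeholder):

providers:
  kubernetesIngressNginx:
    enabled: true
    # FQDN of the NLB in front of Traefik; this is what gets written into the Ingress status
    publishStatusAddress: "traefik-0123456789.elb.eu-west-1.amazonaws.com"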

The problem I’m left with is that, more often than not, I now get an HTTP 418 response for any Ingress; every so often it starts serving the correct content, but then it goes back to returning 418. I’ve enabled the access logs so I can see that Traefik is serving the successful responses; it’s not flipping back to NGINX.

I’ve enabled the Traefik dashboard so I can see all of the HTTP routers that have been gleaned from the NGINX Ingresses. There are lots of noop@internal ones which have the same priority and rule syntax as an HTTP or HTTPS service, and I believe these are what are serving the HTTP 418 responses. Am I right in thinking that if there are two routers with the same rule and priority then Traefik balances between them, which is why I’m getting a mix of the correct content and HTTP 418 responses?
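
For what it’s worth, I’ve been pulling the same router list out of the API rather than clicking through the dashboard (rough sketch; the namespace, Deployment name and port are from my install, and the API has to be reachable on the traefik entrypoint for this to work):

$ kubectl -n traefik port-forward deploy/traefik 9000:9000
$ curl -s http://localhost:9000/api/http/routers | jq '.[] | {name, rule, priority, service}'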

If so, I don’t know where these are coming from. Any ideas?


I’ve also tried this. And you are right: HTTP to HTTPS works, but on HTTPS I get a teapot page, and I don’t know how to handle it. This issue blocks our migration to Traefik.

I’m also just trying out the NGINX compatibility myself, but here’s my shot at it.

I notice that now both NGINX and Traefik fight over updating the Ingress addresses

I wonder if this could be because of the ingress class you are using? I had a similar issue with an Ingress pointing at NGINX’s IP instead of Traefik’s IP when I did not use the correct ingress class.

Perhaps check if a Traefik IngressClass was created, and whether your Ingresses are using that name in the spec.ingressClassName field:

$ kubectl get ingressclass
NAME      CONTROLLER                      PARAMETERS   AGE
traefik   traefik.io/ingress-controller   <none>       5h14m
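
And to see which class each Ingress actually references (the CLASS column):

$ kubectl get ingress -A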

The ingress class hasn’t changed; that’s the point of the NGINX compatibility. All of my Ingress resources are still using their original nginx class.

There is a new Traefik IngressClass created, and if I reconfigure a given Ingress to use this class instead then it works correctly without any HTTP 418 responses, but that defeats the point of the NGINX compatibility provider in the first place.

I’ve found that setting ingressClass.name to nginx will make Traefik’s IngressClass be named nginx, just like the NGINX Ingress Controller’s.
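
In the chart values that’s just this (a sketch; I left ingressClass.enabled at its default):

ingressClass:
  # name the IngressClass that Traefik creates "nginx" so existing Ingresses keep matching it
  name: nginx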

@bodgit - If you don’t mind, we also have the exact same setup as yours: the NGINX Ingress Controller installed with its Service annotated so there is an AWS NLB in front, plus External DNS updating a Route 53 zone with records that point at the NLB FQDN.

Can you share the values.yaml file that you used to install Traefik with the NGINX provider? It would be really helpful.

I’m also trying to use Traefik as a replacement for NGINX Ingress Controller. Our setup is a bit different, as we run RKE2 v1.34.3 (which bundles Traefik v3.6.4) on a set of VMs in GCP.

While trying it out via Google’s internal load balancer I also got a random mix of 418 I’m a teapot and 401 Unauthorized (our expected response from the test service).
I tried connecting to each of the worker nodes separately; my finding was that 4 of the 5 nodes serve the correct response (401) and the remaining one always responds with 418 I’m a teapot.
The HTTP 418 responses are not logged by the rke2-traefik pod even with access logs enabled.
I tried draining and rebooting the faulty node, but it didn’t help.
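
For reference, this is roughly how I tested each node directly (the node IP is made up; /v2/ on Harbor normally answers 401 without credentials):

$ curl -sk -o /dev/null -w '%{http_code}\n' --resolve registry.local:443:10.0.0.11 https://registry.local/v2/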

Then I took a closer look at the Ingress object I use for testing. It is the default Ingress generated by the Harbor Helm chart.

  kind: Ingress
  metadata:
    annotations:
      ingress.kubernetes.io/proxy-body-size: "0"
      ingress.kubernetes.io/ssl-redirect: "true"
      nginx.ingress.kubernetes.io/proxy-body-size: "0"
      nginx.ingress.kubernetes.io/ssl-redirect: "true"
    labels:
      app: harbor
  <...>

By cross-referencing with the Traefik docs and some trial and error I arrived at this configuration:

  kind: Ingress
  metadata:
    annotations:
      nginx.ingress.kubernetes.io/ssl-redirect: "true"
    labels:
      app: harbor
  <...>

that makes my "faulty" node respond with the correct 401 like the others. If I add back even one of the removed annotations, we're back to 418 I'm a teapot.
My assumption is that unsupported annotations break something instead of being silently ignored.
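
A quick way to toggle this in place, without re-templating the chart (the trailing dash removes an annotation; the resource name and namespace are from the Ingress in the appendix below):

$ kubectl -n harbor annotate ingress harbor-ingress \
    ingress.kubernetes.io/proxy-body-size- \
    nginx.ingress.kubernetes.io/proxy-body-size- \
    ingress.kubernetes.io/ssl-redirect-

Helm will of course put the annotations back on the next upgrade.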

tl;dr: It seems that using an annotation unsupported by Traefik on an Ingress object makes one Traefik pod respond with HTTP 418 without logging it.

I presume your Ingresses have some AWS load balancer related annotations, which may trigger the same behavior.

If anyone from the Traefik team is here, please let me know if I should post these findings as an issue on the main GitHub or somewhere else.

Appendix

My HelmChartConfig for Traefik:

apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-traefik
  namespace: kube-system
spec:
  failurePolicy: reinstall
  valuesContent: |-
    ingressRoute:
      dashboard:
        enabled: true
        matchRule: Host(`dashboard.localhost`)
        entryPoints: ["websecure"]
    logs:
      general:
        level: "DEBUG"
      access:
        enabled: true
        format: "json"
    providers:
      kubernetesIngressNginx:
        enabled: true
      kubernetesGateway:
        enabled: true

Relevant part of the Harbor values.yaml:

expose:
  type: ingress
  tls:
    enabled: true
    certSource: secret
    secret:
      secretName: harbor-tls
  ingress:
    hosts:
      core: registry.local
    controller: default
    className: "nginx"

externalURL: https://registry.local

The entirety of the generated Ingress object:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    ingress.kubernetes.io/proxy-body-size: "0"
    ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  labels:
    app: harbor
    app.kubernetes.io/instance: harbor
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: harbor
    app.kubernetes.io/part-of: harbor
    app.kubernetes.io/version: 2.13.1
    chart: harbor
    heritage: Helm
    release: harbor
  name: harbor-ingress
  namespace: harbor
spec:
  ingressClassName: nginx
  rules:
  - host: registry.local
    http:
      paths:
      - backend:
          service:
            name: harbor-core
            port:
              number: 80
        path: /api/
        pathType: Prefix
      - backend:
          service:
            name: harbor-core
            port:
              number: 80
        path: /service/
        pathType: Prefix
      - backend:
          service:
            name: harbor-core
            port:
              number: 80
        path: /v2/
        pathType: Prefix
      - backend:
          service:
            name: harbor-core
            port:
              number: 80
        path: /c/
        pathType: Prefix
      - backend:
          service:
            name: harbor-portal
            port:
              number: 80
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - registry.local
    secretName: harbor-tls

If you think it’s a bug, try the Traefik GitHub.