Potential race condition when migrating ingress-nginx to traefik

megabreit · February 25, 2026, 9:59pm

After reading "Use ingress-nginx Resources in Traefik | Traefik Labs", I tried this method with version 3.6.7 (bundled with RKE2). It works fine so far when ingressClassName is set to nginx: both Ingress controllers serve the same Ingress at the same time.

I then compared this with the “official” migration documentation from SUSE (sadly not publicly available). They suggest a different approach, because there could be race conditions between the two ingress controllers trying to update the same Ingress. To avoid that, they set a different (non-empty) ingressClassName for the compatibility mode in their Helm values; specifically, they set providers.kubernetesIngressNginx.ingressClass="rke2-ingress-nginx-migration". This also works, but it involves downtime to switch the traffic over to the new load balancer.

I like the first method better, because it would allow a truly seamless migration: you only need to switch the ports of the existing LoadBalancer service. But I can't really judge how dangerous such a configuration is. In my tests I didn't see a race condition in the logs, but that doesn't mean there is none; maybe I just don't see it at the current log level. Does anybody here have insights on this?

emile · May 19, 2026, 9:38am

Hi @megabreit, thanks for raising this. The race can be real, and SUSE is right to flag it.
Here's what's actually going on under the hood.

Where the race lives

Both controllers reconcile against the same set of Ingresses (same ingressClassName: nginx) and both write back to status.loadBalancer.ingress[] to publish the LoadBalancer IP they front. Concretely on Traefik's side, the write happens in pkg/provider/kubernetes/ingress-nginx/kubernetes.go (updateIngressStatus) and is triggered whenever providers.kubernetesIngressNginx.publishService or publishStatusAddress is set. The provider does skip the update when the observed status already equals its target (isLoadBalancerIngressEquals), but as long as ingress-nginx is also publishing its own (different) IP, the two controllers keep overwriting each other in a tight loop.
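To make the contested field concrete, here is a minimal sketch of what such an Ingress could look like during coexistence. The resource name, host, backend, and both addresses are made up for illustration (the IPs come from the RFC 5737 documentation ranges); only ingressClassName: nginx and the status.loadBalancer.ingress[] field are taken from the explanation above.

```yaml
# Illustrative only: name, host, backend, and addresses are hypothetical.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app
spec:
  ingressClassName: nginx        # reconciled by both controllers during coexistence
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app
                port:
                  number: 80
status:
  loadBalancer:
    ingress:
      - ip: 192.0.2.10           # hypothetical address published by ingress-nginx
      # On its next reconcile, Traefik would replace this list with its own
      # target, for example:
      # - ip: 203.0.113.20       # hypothetical address published by Traefik
```

Because each controller considers a different address correct, the "skip when already equal" check never settles for both at once.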

It usually doesn't show as an error in the logs, just repeated Updated ingress status info lines on both sides. So your "I didn't see it in the logs" matches what we'd expect.

Why it matters beyond ExternalDNS

The flapping status.loadBalancer.ingress[] affects anything that watches the Ingress status: ExternalDNS (the most obvious one, and what our migration guide already calls out), kube-state-metrics, ArgoCD/Flux which will keep showing the resource as out-of-sync, dashboards, and custom operators. Routing itself is unaffected (both controllers happily route traffic in parallel during the window), but the observability side gets noisy.

Two clean ways to avoid it

Disable status publishing on Traefik during coexistence (the path our migration guide recommends for ExternalDNS users): set
```
providers:
  kubernetesIngressNginx:
    publishService:
      enabled: false
```
Traefik will keep serving the Ingresses normally; it just stops writing the status, so ingress-nginx remains the sole writer. Re-enable publishService on Traefik after you've uninstalled ingress-nginx. Full walkthrough: see the "ExternalDNS Users" block in the migration guide.

Use a transitional IngressClass (SUSE's approach on RKE2): give Traefik's ingress-nginx provider a distinct class such as rke2-ingress-nginx-migration (providers.kubernetesIngressNginx.ingressClass, as in the Helm values you quoted) so that it and ingress-nginx never own the same resource at the same time. This is cleaner operationally and is what we'd recommend specifically on RKE2, where SUSE already wires this up for you.
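For the transitional-class option, the Helm values would look roughly like this. This is a sketch that assumes the same values layout as the publishService snippet above; the key and class name are the ones quoted in the original question, not something verified against SUSE's (non-public) guide.

```yaml
providers:
  kubernetesIngressNginx:
    # Class name as quoted from the SUSE migration approach in this thread.
    # With this set, the provider no longer picks up Ingresses of class "nginx".
    ingressClass: rke2-ingress-nginx-migration
```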

Both approaches work; pick the one that fits your cluster ops better.

On the docs

The race-on-status was mentioned in our migration guide but tucked inside a collapsed "ExternalDNS Users" note, which under-sold it. The issue is broader than just ExternalDNS, and the keyword "race condition" did not appear, which made it hard to find when you go looking. I've opened traefik/traefik#13205 to fix that: the note is promoted to a visible warning, the affected field is named explicitly, the impact list is broadened (kube-state-metrics, ArgoCD/Flux, dashboards, custom operators), and the SUSE transitional-class method is added as a second mitigation option. Thanks for the nudge.

Hope this clears it up. Happy to dig further if you have a repro showing different behavior than the above.
