502 blip when updating/restarting traefik deployment in Kubernetes

Hey!

Been using Traefik for around two years now and it does a great job.

We've had one minor issue for at least a year now that we've been looking the other way on, but I figured I'd check whether there's a thread or link I've missed, or any debugging guidance.

When we restart the deployment, or update the Helm release (which rolls the deployment), we get a blip of 502s as the Traefik pod you were routing through gets shut down.

We have tried running multiple replicas as well as a single replica (where the old pod stays up until the new one is ready), and we have added `minReadySeconds` to the deployment so the new pod is alive for ~30 seconds before it even tries to kill the old pod, just in case. But as soon as the old pod is removed, the first refresh of an app gets a 502 back from CloudFront, and then the second refresh (immediately after the first failure) is fine again.
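From what I've read, the usual mitigation for this class of 502 is to keep the old pod serving until the load balancer has finished deregistering it, e.g. a `preStop` sleep plus a matching termination grace period. We are not running this yet; the values below are placeholders, not something we've validated:

```yaml
# Hypothetical snippet for the Traefik Deployment pod template.
# Idea: delay SIGTERM so the ALB stops sending new connections
# to this pod before Traefik actually begins shutting down.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60   # must exceed the preStop sleep
      containers:
        - name: traefik
          lifecycle:
            preStop:
              exec:
                # Sleep long enough for target deregistration to complete.
                command: ["sh", "-c", "sleep 30"]
```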

Not a huge issue, but it causes false positives in our canaries, and I'm sure users notice it from time to time.

Below is all the context I can think of to help picture our setup.

Traefik is installed via Helm (chart version 34.1.0, so docker.io/traefik:v3.3.2). It runs in AWS, and the traffic flow is:

app domain (app.dev.DOMAIN) -> Cloudfront -> Traefik Load balancer DNS (alb.dev.DOMAIN) -> Kube Ingress -> traefik routers kick in here on app domain (app.dev.DOMAIN)

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/backend-protocol: HTTPS
    alb.ingress.kubernetes.io/certificate-arn: >-
      arn:aws:acm:ap-region:12345:certificate/12345
    alb.ingress.kubernetes.io/group.name: traefik-alb-external
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443, "HTTP":80}]'
    alb.ingress.kubernetes.io/load-balancer-attributes: idle_timeout.timeout_seconds=4000
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/shield-advanced-protection: 'true'
    alb.ingress.kubernetes.io/ssl-redirect: '443'
    alb.ingress.kubernetes.io/tags: environment=development
    alb.ingress.kubernetes.io/target-type: ip
    external-dns.alpha.kubernetes.io/hostname: alb.dev.DOMAIN
    external-dns.alpha.kubernetes.io/ingress-hostname-source: annotation-only
    external-dns.alpha.kubernetes.io/type: public
    kubernetes.io/ingress.class: default
    meta.helm.sh/release-name: traefik
    meta.helm.sh/release-namespace: traefik
  creationTimestamp: '2024-02-13T06:57:01Z'
  finalizers:
    - group.ingress.k8s.aws/traefik-alb-external
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Helm
    argocd.argoproj.io/instance: traefik-development
  name: traefik-external
  namespace: traefik
  resourceVersion: '584241109'
  uid: ea56d5d7-85fd-4884-96f5-7383e2c1740f
spec:
  rules:
    - http:
        paths:
          - backend:
              service:
                name: traefik
                port:
                  name: websecure
            path: /*
            pathType: ImplementationSpecific
status:
  loadBalancer:
    ingress:
      - hostname: >-
          k8s-traefikalbexterna-1234.ap-region.elb.amazonaws.com

In case it matters: we manually set which ALB to use, as we have two set up. One directs traffic to a subnet only accessible from our VPN and cluster subnets, and the one above is used for public traffic.
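One knob on the ALB side I've seen suggested (we haven't tried it) is shortening the target group's deregistration delay via an annotation, so the old pod is drained from the target group faster relative to its termination. The 30s value here is just a placeholder:

```yaml
# Hypothetical extra annotation on the Ingress above (AWS Load
# Balancer Controller): shorten how long a deregistering target
# keeps draining before it is fully removed.
metadata:
  annotations:
    alb.ingress.kubernetes.io/target-group-attributes: deregistration_delay.timeout_seconds=30
```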

 - args:
            - '--api.insecure=true'
            - '--serverstransport.insecureskipverify=true'
            - '--entryPoints.metrics.address=:9100/tcp'
            - '--entryPoints.traefik.address=:8080/tcp'
            - '--entryPoints.web.address=:8000/tcp'
            - '--entryPoints.websecure.address=:8443/tcp'
            - '--api.dashboard=true'
            - '--ping=true'
            - '--metrics.prometheus=true'
            - '--metrics.prometheus.entrypoint=metrics'
            - '--providers.kubernetescrd'
            - '--providers.kubernetescrd.ingressClass=traefik'
            - '--providers.kubernetescrd.allowCrossNamespace=true'
            - '--providers.kubernetescrd.allowExternalNameServices=true'
            - '--providers.kubernetescrd.allowEmptyServices=false'
            - '--providers.kubernetesingress'
            - '--providers.kubernetesingress.allowExternalNameServices=true'
            - '--providers.kubernetesingress.allowEmptyServices=false'
            - >-
              --providers.kubernetesingress.ingressendpoint.publishedservice=traefik/traefik
            - '--providers.kubernetesingress.ingressClass=traefik'
            - '--entryPoints.websecure.http.tls=true'
            - '--log.level=ERROR'
            - '--accesslog=true'
            - '--accesslog.fields.defaultmode=keep'
            - '--accesslog.fields.headers.defaultmode=drop'
            - '--providers.file.filename=/config/dynamic.yaml'
            - >-
              --providers.kubernetesingress.ingressEndpoint.hostname=alb.dev.DOMAIN
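For completeness, these are the graceful-shutdown options I understand Traefik itself exposes on an entry point's transport lifecycle. We are not currently setting them, and the durations below are placeholders, not recommendations:

```yaml
# Hypothetical additional static-config args for the container above.
# requestAcceptGraceTimeout: keep accepting requests for a while after
# the shutdown signal (health checks start failing during this window).
# graceTimeOut: how long in-flight requests get to finish afterwards.
- '--entryPoints.websecure.transport.lifeCycle.requestAcceptGraceTimeout=15s'
- '--entryPoints.websecure.transport.lifeCycle.graceTimeOut=30s'
```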
dynamic.yaml: |
    http:
      routers:
        catchall:
          # attached only to the websecure entryPoint
          entryPoints:
            - 'websecure'
          # catchall rule
          rule: 'PathPrefix(`/`)'
          service: unavailable
          # lowest possible priority
          # evaluated when no other router is matched
          priority: 1
      services:
        # Service that will always answer a 503 Service Unavailable response
        unavailable:
          loadBalancer:
            servers: {}

any help would be appreciated.

Thanks!

I have found this thread: Bad Gateway during Helm Upgrade/Rollout

But we don't use NGINX, as we use the ALB.