We run a private Harbor registry and expose it through the Traefik ingress controller. For some reason, image pulls from Harbor time out, regardless of the client (podman, crictl, or anything else). The only error we can find is in the Traefik debug logs:
DBG github.com/traefik/traefik/v3/pkg/middlewares/recovery/recovery.go:45 > Request has been aborted [10.42.2.0:5760 - /v2/pr-rancher/rancher/rke2-runtime/blobs/sha256:0cb914462bbe526255276af0c931468b956c464e55c1f7bde3fb8b124d4c55f7?ns=registry.rancher.com]: net/http: abort Handler middlewareName=traefik-internal-recovery middlewareType=Recovery
We had to switch from Contour to Traefik a few weeks ago, and the issue has persisted ever since without a solution.
Traefik Version: v3.5.2
Traefik custom Helm values:
deployment:
  kind: DaemonSet
gateway:
  enabled: false
image:
  registry: docker.io
  repository: traefik
ingressClass:
  enabled: true
  isDefaultClass: true
  name: traefik
metrics:
  addInternals: false
  prometheus:
    entryPoint: metrics
    service:
      enabled: false
    serviceMonitor:
      enabled: false
ports:
  tcp-30000:
    expose:
      default: true
    exposedPort: 30000
    port: 30000
    protocol: TCP
  udp-30001:
    expose:
      default: true
    exposedPort: 30001
    port: 30001
    protocol: UDP
  web:
    transport:
      respondingTimeouts:
        idleTimeout: 300s
        readTimeout: 300s
        writeTimeout: 300s
  websecure:
    transport:
      respondingTimeouts:
        idleTimeout: 300s
        readTimeout: 300s
        writeTimeout: 300s
providers:
  kubernetesGateway:
    enabled: true
    experimentalChannel: true
We have tried both the default goharbor Ingress resource and, more recently, an IngressRoute:
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: goharbor-ingressroute
  namespace: harbor
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`xxx`) && PathPrefix(`/api/`)
      kind: Rule
      services:
        - name: goharbor-core
          port: 80
          passHostHeader: true
          serversTransport: harbor-transport
    - match: Host(`xxx`) && PathPrefix(`/service/`)
      kind: Rule
      services:
        - name: goharbor-core
          port: 80
          passHostHeader: true
          serversTransport: harbor-transport
    - match: Host(`xxx`) && PathPrefix(`/v2/`)
      kind: Rule
      services:
        - name: goharbor-core
          port: 80
          passHostHeader: true
          serversTransport: harbor-transport
    - match: Host(`xxx`) && PathPrefix(`/c/`)
      kind: Rule
      services:
        - name: goharbor-core
          port: 80
          passHostHeader: true
          serversTransport: harbor-transport
    - match: Host(`xxx`) && PathPrefix(`/`)
      kind: Rule
      services:
        - name: goharbor-portal
          port: 80
          passHostHeader: true
  tls:
    secretName: xxx
---
apiVersion: traefik.io/v1alpha1
kind: ServersTransport
metadata:
  name: harbor-transport
  namespace: harbor
spec:
  disableHTTP2: true
  forwardingTimeouts:
    dialTimeout: "30s"
    responseHeaderTimeout: "0s"
    idleConnTimeout: "0s"
Other than these settings, everything is the bare-bones default configuration. Aside from the debug log above, there are no other error messages.
The current issue is that an image pull either succeeds with multiple connection timeouts interrupting the progress bar or, worse, fails completely, and we have to re-trigger it manually several times before it succeeds. This is especially problematic because it interrupts our automations.
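To narrow this down outside of podman or crictl, a single blob download can be timed in isolation. The following is a minimal sketch, not part of our setup: the names `drain`, `pull_blob` and `PullResult` are hypothetical, the blob URL and bearer token are placeholders that have to be supplied, and it assumes the registry answers a plain GET on the blob path with a `Content-Length` header. It reports how many bytes arrived and after how many seconds the stream stopped, which may help correlate the abort with one of the configured timeouts.

```python
import http.client
import sys
import time
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class PullResult:
    bytes_read: int
    seconds: float
    error: Optional[str]  # None when the stream ended cleanly and completely


def drain(stream, chunk_size: int = 1 << 20, expected: Optional[int] = None) -> PullResult:
    """Read a stream to the end and record how far it got before any failure."""
    start = time.monotonic()
    total = 0
    error = None
    try:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    except (OSError, http.client.HTTPException) as exc:
        # Timeouts and connection resets are OSError subclasses.
        error = f"{type(exc).__name__}: {exc}"
    # A connection closed early can look like a normal end of stream,
    # so compare against the announced length when one is known.
    if error is None and expected is not None and total != expected:
        error = f"short read: got {total} of {expected} bytes"
    return PullResult(total, time.monotonic() - start, error)


def pull_blob(url: str, token: Optional[str] = None, timeout: float = 300.0) -> PullResult:
    """GET one blob URL (placeholder, supplied by the caller) and drain it."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        length = response.headers.get("Content-Length")
        return drain(response, expected=int(length) if length else None)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python pull_blob.py <blob-url> [bearer-token]
    print(pull_blob(sys.argv[1], sys.argv[2] if len(sys.argv) > 2 else None))
```

Running it several times against the same blob URL (for example one of the `/v2/.../blobs/sha256:...` paths from the debug log) and comparing `seconds` across failed runs should show whether the aborts cluster around a fixed duration or occur at random points.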