Hey,
we are running traefik v2.5.4 (chart version 10.6.2) on our kubernetes cluster as ingress. We use the Traefik CRDs (IngressRoute) to define our routes. We noticed that our docker pushes to a registry (harbor) behind traefik were really slow, so we ran a few tests of pure web traffic in a few different scenarios to isolate the problem. For the tests we used whoami/bench as the server and wrk as the client.
Problem description and setup
Our traefik configuration has the following overrides:
deployment:
  initContainers:
  - command:
    - sh
    - -c
    - chmod -Rv 600 /data/*
    image: busybox:1.31.1
    name: volume-permissions
    volumeMounts:
    - mountPath: /data
      name: data
env:
- name: CF_DNS_API_TOKEN
  valueFrom:
    secretKeyRef:
      key: token
      name: cloudflare-apitoken
globalArguments:
- --certificatesresolvers.le.acme.caserver=https://acme-v02.api.letsencrypt.org/directory
- --certificatesresolvers.le.acme.dnschallenge=true
- --certificatesresolvers.le.acme.dnschallenge.provider=cloudflare
- --certificatesresolvers.le.acme.storage=/data/acme.json
- --certificatesresolvers.le.acme.dnschallenge.resolvers=1.1.1.1:53,8.8.8.8:53
ingressRoute:
  dashboard:
    enabled: false
logs:
  general:
    level: INFO
persistence:
  enabled: true
  storageClass: longhorn
ports:
  web-int:
    expose: false
    port: 9080
    protocol: TCP
  websecure-int:
    expose: false
    port: 9443
    protocol: TCP
    tls:
      certResolver: ""
      domains: []
      enabled: false
      options: ""
providers:
  kubernetesCRD:
    allowCrossNamespace: true
service:
  externalIPs:
  - X.X.X.X
Nothing big except adding Let's Encrypt, although the following tests were performed without TLS to rule that out. We also tested the network between the agents that host the pods and it was not a bottleneck. For all the tests we made sure that the server and client pods ran on different nodes.
The three pods involved in the tests are:
The wrk pod, which is the client running the tests:
apiVersion: v1
kind: Pod
metadata:
  name: potts
  namespace: test
spec:
  containers:
    - name: web
      image: skandyla/wrk
      command:
        - sleep 
        - "400000000"
The whoami/bench server pod, which handles the incoming requests:
apiVersion: v1
kind: Pod
metadata:
  name: whoami
  namespace: test
  labels:
    role: whoami
spec:
  containers:
    - name: whoami
      image: containous/whoami
      ports:
        - name: web
          containerPort: 80
          protocol: TCP
A service which can expose the whoami server:
apiVersion: v1
kind: Service
metadata:
  name: my-whoami
  namespace: test
spec:
  selector:
    role: whoami
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
Tests
Baseline
To have a baseline we let the two pods talk directly to each other, not using a service or traefik but the internal cluster IP of the whoami pod. This is as fast as it could theoretically get:
wrk -t20 -c1000 -d60s -H "Host: doesnt.matter" --latency  http://(internal pod IP of whoami):80/bench
Running 1m test @ http://10.12.162.41:80/bench
  20 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.36ms    6.20ms 237.66ms   98.87%
    Req/Sec     9.71k     1.53k   28.32k    82.03%
  Latency Distribution
     50%    4.89ms
     75%    5.50ms
     90%    6.22ms
     99%   12.08ms
  8865524 requests in 1.00m, 1.04GB read
  Socket errors: connect 0, read 2869, write 1715713, timeout 0
Requests/sec: 147525.20
Transfer/sec:     17.73MB
We ran the test a few times and got values of around 16-17 MB/s every time.
Using a cluster service
This is more of an additional test. Here we used the cluster-internal DNS name of the service, i.e. the same service that will later be used by the IngressRoute in the traefik test.
wrk -t20 -c1000 -d60s -H "Host: doesnt.matter" --latency  http://(whoami-service):80/bench
Running 1m test @ http://my-whoami.test:80/bench
  20 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.69ms    7.58ms 872.81ms   99.12%
    Req/Sec     9.29k     1.47k   47.03k    88.16%
  Latency Distribution
     50%    5.19ms
     75%    5.71ms
     90%    6.49ms
     99%   12.43ms
  6960258 requests in 1.00m, 836.37MB read
  Socket errors: connect 0, read 3643, write 177908, timeout 0
Requests/sec: 115845.62
Transfer/sec:     13.92MB
There was a decline in performance, with values around 13 MB/s across multiple tests.
Using traefik
We added an IngressRoute that goes to that service and used the IP of the control plane so the traffic would be routed over traefik.
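The IngressRoute itself is minimal, roughly along these lines (a sketch, the resource name and the web entryPoint here are illustrative; the real route just matches the Host header we send with wrk and forwards to the my-whoami service on port 80):
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: whoami                       # illustrative name, may differ from the actual resource
  namespace: test
spec:
  entryPoints:
    - web                            # plain HTTP entrypoint, no TLS involved in this test
  routes:
    - match: Host(`whoami.test`)     # the Host header wrk sends in the traefik test
      kind: Rule
      services:
        - name: my-whoami            # the service shown above
          port: 80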
wrk -t20 -c1000 -d60s -H "Host: whoami.test" --latency  http://(IP of control plane):80/bench
Running 1m test @ http://whoami.test:80/bench
  20 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    72.13ms   46.64ms 269.60ms   63.64%
    Req/Sec   709.05     84.82     1.59k    70.17%
  Latency Distribution
     50%   72.93ms
     75%  104.90ms
     90%  136.15ms
     99%  175.82ms
  847338 requests in 1.00m, 82.42MB read
Requests/sec:  14104.06
Transfer/sec:      1.37MB
We did expect a drop in performance, but it is roughly a 10x drop (from ~148k req/s at the baseline to ~14k req/s) as soon as the traffic goes through traefik.
Conclusion
So we are not sure at the moment how to tackle this. Is this expected, or did we misconfigure something, or do we need additional configuration to make this flow fast? Any ideas are welcome.
Cheers