I've observed that enabling the gzip module while going through the proxy causes the nginx worker to peg a CPU at 100%, which doesn't happen when connecting directly. This is a strange anomaly that may be worth further investigation.
I first observed the speed discrepancy in swarm mode, as seen below (this is with the gzip module off):
# swarm mode - traefik proxy (:80) - gzip module disabled
➜ ~ curl --limit-rate 2G -o /dev/null http://localhost/test.img
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 50.0G  100 50.0G    0     0   940M      0  0:00:54  0:00:54 --:--:-- 1301M
# swarm mode - nginx port map (:8080) - gzip module disabled
➜ ~ curl --limit-rate 2G -o /dev/null http://localhost:8080/test.img
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 50.0G  100 50.0G    0     0  2040M      0  0:00:25  0:00:25 --:--:-- 2139M
With the gzip module enabled, I've observed the nginx worker using a lot of CPU:
systemd+ 9468 87.9 0.0 11412 3108 ? S 12:38 1:46 nginx: worker process
and constrained bandwidth compared to the non-proxied connection. This is peculiar because these requests aren't actually being compressed: that would require the client to send an Accept-Encoding header, which I'll do in a moment.
# swarm mode - traefik proxy (:80) - gzip module enabled
➜ ~ curl --limit-rate 2G -o /dev/null http://localhost/test.img
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 50.0G    0 50.0G    0     0   498M      0 --:--:--  0:01:42 --:--:--  499M
# swarm mode - nginx port map (:8080) - gzip module enabled
➜ ~ curl --limit-rate 2G -o /dev/null http://localhost:8080/test.img
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 50.0G  100 50.0G    0     0  2022M      0  0:00:25  0:00:25 --:--:-- 2108M
Let's take a look at the speeds when accepting gzip encoding:
# swarm mode - traefik proxy (:80) - gzip module enabled + encoding enabled
➜ test-img curl -H 'Accept-encoding: gzip' --limit-rate 2G -o /dev/null http://localhost/test-5g.img
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 22.3M    0 22.3M    0     0  2246k      0 --:--:--  0:00:10 --:--:-- 2226k
# swarm mode - nginx port map (:8080) - gzip module enabled + encoding enabled
➜ test-img curl -H 'Accept-encoding: gzip' --limit-rate 2G -o /dev/null http://localhost:8080/test-5g.img
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 22.3M    0 22.3M    0     0  2308k      0 --:--:--  0:00:09 --:--:-- 2309k
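A 5 GB file transferring as roughly 22 MB of gzip output implies an extremely compressible (for example zero-filled) test file. Assuming that, a similar file can be recreated with dd; the file name and the scaled-down size below are my assumptions, not from the original setup:

```shell
# Hypothetical recreation of a highly compressible test file like the
# ones used above (scaled down from the original 5G/50G sizes).
dd if=/dev/zero of=test-small.img bs=1M count=16 status=none

# Compressing it shows the extreme ratio that would explain a 5 GB
# file arriving as ~22 MB of gzip output.
gzip -kf test-small.img
ls -l test-small.img test-small.img.gz
```

A ratio this large also means the transfer becomes CPU-bound on deflate rather than network-bound, which matches the throughput collapse in both runs above.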
As you can see, both downloads are heavily constrained once nginx is actually compressing the content, which is expected. As for why merely enabling the gzip module degrades proxied throughput, it would be interesting to see whether the same thing happens behind other proxy servers, such as HAProxy.
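For that comparison, a minimal HAProxy configuration along these lines could stand in for Traefik on port 80; the frontend/backend names are my assumptions, and the backend address matches the nginx service above:

```
# haproxy.cfg - hypothetical minimal equivalent of the Traefik setup above
defaults
    mode http
    timeout connect 5s
    timeout client  60s
    timeout server  60s

frontend web
    bind :80
    default_backend nginx

backend nginx
    server nginx1 127.0.0.1:8080
```

If the slowdown reproduces behind HAProxy too, that would point away from Traefik itself and toward the networking path or nginx's behavior when proxied.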
Notably, this issue doesn't occur when both services run in host mode, which is why I believe it may be directly related to how swarm handles networking. In host mode I observe the opposite: direct downloads from nginx are slower.
# host mode - traefik proxy - gzip module disabled
➜ test-00a curl --limit-rate 2G -o /dev/null http://localhost/test.img
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 50.0G  100 50.0G    0     0  1415M      0  0:00:36  0:00:36 --:--:-- 1423M
# host mode - nginx direct - gzip module disabled
➜ test-00a curl --limit-rate 2G -o /dev/null http://localhost:8080/test.img
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 50.0G  100 50.0G    0     0  1093M      0  0:00:46  0:00:46 --:--:-- 1235M
I am using a file provider to connect directly to the service exposed via network_mode: host; that provider configuration is included below along with the other configuration files.
swarm.yaml:
version: '3.7'

networks:
  traefik:
    external: true

services:
  proxy:
    image: traefik:latest
    command:
      - '--providers.docker=true'
      - '--entryPoints.web.address=:80'
      - '--providers.providersThrottleDuration=2s'
      - '--providers.docker.watch=true'
      - '--providers.docker.swarmMode=true'
      - '--providers.docker.swarmModeRefreshSeconds=15s'
      - '--providers.docker.exposedbydefault=false'
      #- '--providers.file.filename=/etc/traefik/rules.toml'
      - '--ping.entryPoint=web'
    ports:
      - 80:80
    networks:
      - traefik
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./traefik/rules.toml:/etc/traefik/rules.toml
  nginx:
    image: nginx
    networks:
      - traefik
    ports:
      - 8080:8080
    deploy:
      labels:
        - traefik.enable=true
        - traefik.http.services.nginx.loadbalancer.server.port=8080
        - traefik.http.routers.nginx.rule=Host(`localhost`)
        - traefik.http.routers.nginx.service=nginx
        - traefik.http.routers.nginx.entrypoints=web
        - traefik.docker.network=traefik
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf
      - ./test-img:/usr/share/nginx/html
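For completeness, a stack like this would typically be brought up with commands along these lines (the overlay network name comes from the file above; the stack name is my assumption):

```
docker network create --driver overlay traefik
docker stack deploy -c swarm.yaml speedtest
```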
rules.toml:
[http.routers]
  # Define a connection between requests and services
  [http.routers.speedtest]
    rule = "Host(`localhost`)"
    entrypoints = ["web"]
    service = "speedtest"

[http.services]
  # Define how to reach an existing service on our infrastructure
  [http.services.speedtest.loadBalancer]
    [[http.services.speedtest.loadBalancer.servers]]
      url = "http://127.0.0.1:8080"
docker-compose.yaml:
version: '3.7'

services:
  proxy:
    image: traefik:latest
    command:
      - '--entryPoints.web.address=:80'
      - '--providers.providersThrottleDuration=2s'
      - '--providers.file.filename=/etc/traefik/rules.toml'
      - '--ping.entryPoint=web'
    network_mode: host
    volumes:
      - ./traefik/rules.toml:/etc/traefik/rules.toml
  nginx:
    image: nginx
    network_mode: host
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf
      - ./test-img:/usr/share/nginx/html
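The host-mode variant would be started with something like the following. Note that no ports are mapped here: with network_mode: host, nginx's listen 8080 and Traefik's entrypoint :80 bind directly on the host, which is what the file provider's 127.0.0.1:8080 URL relies on.

```
docker-compose up -d
```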
nginx.conf:
user  nginx;
worker_processes  1;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;
    #gzip_types  application/octet-stream;

    server {
        listen       8080;
        listen  [::]:8080;
        server_name  localhost;

        location / {
            root   /usr/share/nginx/html;
            index  index.html index.htm;
        }

        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   /usr/share/nginx/html;
        }
    }
}
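For the gzip-enabled runs, I assume the two commented directives above were simply uncommented, i.e.:

```
gzip       on;
gzip_types application/octet-stream;
```

The gzip_types line matters here: by default nginx only compresses text/html, and the .img files fall under the default_type of application/octet-stream, so without it the gzip'd transfers above would not have been compressed at all.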
I don't have an explanation for these observations, but I've passed them on to the developers to see whether there is any further action we can take to better explain what is happening here.