Hello everyone,
as suggested / requested multiple times by the @traefik account and @patricia_dugan on Twitter, I am now taking this topic to the community forums.
First of all I must say that I have now migrated to v2, but it took me quite some time and this post will focus on the reasons for that. This is meant as constructive and functional feedback to the developers and can also serve as examples for the community. I will list positive and negative aspects of my migration story. Let's start with some background on my setup.
My environment
I am using traefik as a Docker service pinned to individual Docker nodes in a Docker Swarm cluster. This means I am not using the ingress or load-balancing features, because I want specific sites to be reachable only on specific nodes / IP addresses, and therefore I also do not run traefik with Swarm awareness. So my 4 traefik instances are basically single instances, each observing and handling traffic for a single Docker node. Before you ask: the Docker Swarm cluster is used for purposes other than load-balancing the web traffic, such as service management via Docker Stack. Swarm aside, I think my deployment scenario is pretty straightforward and similar to a single traefik instance serving a single Docker node, just repeated 4 times.
My deployment story
I deploy all my Docker nodes, including web applications / services and of course traefik, with Ansible on the Swarm cluster via Docker Stack definitions. This means I can use Ansible Jinja2 templating to generate the required YAML configuration files for traefik and also the Docker Stack definitions. While the general traefik configuration is done via CLI parameters, all web applications / services are served by traefik via the Docker provider based on container (not Swarm service) labels.
My migration story
The original Docker Stack definition for my traefik v1 stack basically looked like this in the Ansible deployment task:
    - name: deploy traefik stack
      command: docker stack deploy --prune --compose-file - traefik-{{ inventory_hostname_short }}
      args:
        stdin: |
          version: '3.7'
          services:
            traefik:
              image: traefik:1.7.10
              init: true
              command:
                - "--api"
                - "--api.entryPoint=traefik"
                - "--docker"
                - "--docker.domain=example.com"
                - "--docker.exposedByDefault=false"
                - "--docker.templateVersion=2"
                - "--docker.filename=/etc/traefik/docker.tmpl"
                - "--docker.watch=true"
                - "--acme"
                - "--acme.email=..."
                - "--acme.entryPoint=https"
                - "--acme.storage=/etc/traefik/acme/acme.json"
                - "--acme.onHostRule=true"
                - "--acme.onDemand=false"
                - "--acme.httpChallenge.entryPoint=http"
                - "--entryPoints=Name:http Address::80 Redirect.EntryPoint:https"
                - "--entryPoints=Name:https Address::443 TLS"
                - "--entryPoints=Name:traefik Address::8080"
                - "--defaultentrypoints=http,https"
                - "--accessLog"
                - "--accessLog.filePath=/var/log/traefik/access.log"
                - "--metrics.prometheus"
                - "--metrics.prometheus.entryPoint=traefik"
                - "--tracing.jaeger.samplingServerURL=http://agent:5778/sampling"
                - "--tracing.jaeger.localAgentHostPort=agent:6831"
              logging:
                driver: "journald"
              labels:
                traefik.enable: "true"
                traefik.backend: "traefik"
                traefik.frontend.rule: "Host:{{ inventory_hostname }}"
                traefik.frontend.auth.basic: "..."
                traefik.port: "8080"
                traefik.docker.network: "traefik-{{ inventory_hostname_short }}_dmz"
              ports:
                - target: 80
                  published: 80
                  protocol: tcp
                  mode: host
                - target: 443
                  published: 443
                  protocol: tcp
                  mode: host
              networks:
                - dmz
              environment:
                DOCKER_TEMPLATE_CHECKSUM: "{{ docker_template.checksum }}"
              volumes:
                - /home/docker-traefik/docker.tmpl:/etc/traefik/docker.tmpl
                - /home/docker-traefik/acme:/etc/traefik/acme
                - /home/docker-traefik/logs:/var/log/traefik
                - /var/run/docker.sock:/var/run/docker.sock
                - /dev/null:/traefik.toml
              deploy:
                endpoint_mode: dnsrr
                placement:
                  constraints:
                    - node.hostname == {{ inventory_hostname }}
          networks:
            dmz:
              driver: overlay
              attachable: true
              ipam:
                driver: default
                config:
                  - subnet: "10.8.{{ groups.docker.index(inventory_hostname) }}.0/24"
      delegate_to: "{{ groups['docker_swarm_manager'][0] }}"
      delegate_facts: True
      become: yes
      tags: docker-traefik
As you can see this was a pretty straightforward setup with the following traefik features being used:
- Docker provider (with custom template to generate static backend names)
- Certificates via Let's Encrypt (ACME)
- Redirect from HTTP to HTTPS for all containers
- Default EntryPoints for HTTP and HTTPS
- EntryPoint traefik to be used for Dashboard / API interface (served with authentication via container labels)
- Access logging, Prometheus metrics and Jaeger tracing
Converted for use with traefik v2, the task now looks like this:
    - name: deploy traefik stack
      command: docker stack deploy --prune --compose-file - traefik-{{ inventory_hostname_short }}
      args:
        stdin: |
          version: '3.7'
          services:
            traefik:
              image: traefik:2.0
              init: true
              command:
                - "--api"
                - "--api.insecure=true"
                - "--api.dashboard=true"
                - "--providers.docker"
                - "--providers.docker.exposedByDefault=false"
                #- "--docker.templateVersion=2"
                #- "--docker.filename=/etc/traefik/docker.tmpl"
                - "--providers.docker.watch"
                - "--providers.file"
                - "--providers.file.filename=/etc/traefik/file.yaml"
                - "--providers.file.watch"
                #- "--acme"
                - "--certificatesResolvers.letsencrypt.acme.email=..."
                #- "--acme.entryPoint=https"
                - "--certificatesResolvers.letsencrypt.acme.storage=/etc/traefik/acme/acme.json"
                - "--certificatesResolvers.letsencrypt.acme.tlsChallenge=true"
                #- "--acme.onHostRule=true"
                #- "--acme.onDemand=false"
                #- "--acme.httpChallenge.entryPoint=http"
                - "--entryPoints.http"
                - "--entryPoints.http.address=:80"
                #- "-- Redirect.EntryPoint:https"
                - "--entryPoints.https"
                - "--entryPoints.https.address=:443"
                #- "-- TLS"
                - "--entryPoints.traefik.address=:8080"
                #- "--defaultentrypoints=http,https"
                - "--accessLog"
                - "--accessLog.filePath=/var/log/traefik/access.log"
                - "--metrics.prometheus=true"
                - "--metrics.prometheus.entryPoint=traefik"
                - "--tracing.jaeger=true"
                - "--tracing.jaeger.samplingServerURL=http://agent:5778/sampling"
                - "--tracing.jaeger.localAgentHostPort=agent:6831"
              logging:
                driver: "journald"
              labels:
                traefik.enable: "true"
                traefik.docker.network: "traefik-{{ inventory_hostname_short }}_dmz"
                traefik.http.routers.traefik-https.rule: &rule "Host(`{{ inventory_hostname }}`)"
                traefik.http.routers.traefik-https.entrypoints: "https"
                traefik.http.routers.traefik-https.middlewares: "auth@file"
                traefik.http.routers.traefik-https.tls: "true"
                traefik.http.routers.traefik-https.tls.certResolver: "letsencrypt"
                traefik.http.routers.traefik-http.rule: *rule
                traefik.http.routers.traefik-http.entrypoints: "http"
                traefik.http.routers.traefik-http.middlewares: "https@file"
                traefik.http.services.traefik.loadbalancer.server.port: "8080"
              ports:
                - target: 80
                  published: 80
                  protocol: tcp
                  mode: host
                - target: 443
                  published: 443
                  protocol: tcp
                  mode: host
              networks:
                - dmz
              environment:
                FILE_CONFIG_CHECKSUM: "{{ file_config.checksum }}"
              volumes:
                #- /home/docker-traefik/docker.tmpl:/etc/traefik/docker.tmpl
                - /home/docker-traefik/file.yaml:/etc/traefik/file.yaml
                - /home/docker-traefik/acme:/etc/traefik/acme
                - /home/docker-traefik/logs:/var/log/traefik
                - /var/run/docker.sock:/var/run/docker.sock
                #- /dev/null:/traefik.toml
              deploy:
                endpoint_mode: dnsrr
                placement:
                  constraints:
                    - node.hostname == {{ inventory_hostname }}
          networks:
            dmz:
              driver: overlay
              attachable: true
              ipam:
                driver: default
                config:
                  - subnet: "10.8.{{ groups.docker.index(inventory_hostname) }}.0/24"
      delegate_to: "{{ groups['docker_swarm_manager'][0] }}"
      delegate_facts: True
      become: yes
      tags: docker-traefik
As you can see, I was able to keep most of the general configuration in the CLI parameters. The features listed above now map as follows:
- Docker provider (no need for a custom template, because backend (service) names are now static by design via container labels)
- (New) File provider for reusable middlewares, e.g. for authentication and redirects
- Certificates (still) via Let's Encrypt (ACME, now via TLS challenge), but unfortunately onHostRule is gone (this leads to seriously redundant container labels)
- EntryPoints for HTTP and HTTPS (the "Default" part is gone)
- EntryPoint traefik to be used for Dashboard / API interface (served with authentication via container labels) (still there)
- Access logging, Prometheus metrics and Jaeger tracing (still there)
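For readers wondering what sits behind the `auth@file` and `https@file` references: the file provider configuration is not shown in this post, so the following is only a minimal sketch of what such a `/etc/traefik/file.yaml` could look like. It assumes `https` is a scheme redirect and `auth` is basic authentication; the user entry is a placeholder, not a real credential.

```yaml
# Hypothetical /etc/traefik/file.yaml for the traefik v2 file provider.
# The middleware names match the "@file" references used in the labels.
http:
  middlewares:
    # Referenced as "https@file": redirect plain HTTP requests to HTTPS.
    https:
      redirectScheme:
        scheme: https
        permanent: true
    # Referenced as "auth@file": HTTP basic authentication.
    auth:
      basicAuth:
        users:
          - "admin:<htpasswd-hash>"  # placeholder, e.g. generated with htpasswd
```

Defining the middlewares once in the file provider is what allows every container to reuse them by name instead of repeating their definition in labels.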
IMHO completely missing are the following features:
- Being able to set sensible defaults for EntryPoints, ACME and TLS configuration (see above)
- Redirect from HTTP to HTTPS for all containers
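A possible workaround for the missing global redirect is a single catch-all router on the http EntryPoint, defined once in the file provider. This is a sketch under assumptions, not a verified setup: the router name `redirect-all`, the dummy service `noop` and its URL are hypothetical, and the `https` middleware is assumed to be a `redirectScheme` middleware.

```yaml
# Hypothetical addition to /etc/traefik/file.yaml: redirect all HTTP requests.
http:
  routers:
    redirect-all:
      entryPoints:
        - http
      # Matches any host. Routers with longer rules get a higher default
      # priority, so more specific HTTP routers would still win if present.
      rule: "HostRegexp(`{any:.+}`)"
      middlewares:
        - https
      # A router has to reference a service; this one should never be
      # reached because the redirect middleware answers the request first.
      service: noop
  middlewares:
    https:
      redirectScheme:
        scheme: https
  services:
    noop:
      loadBalancer:
        servers:
          - url: "http://127.0.0.1"  # placeholder address, never contacted
```

If this works as intended, the per-container `-http` routers become unnecessary, but it remains a workaround rather than a configurable default.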
Of course traefik v2 still provides some new stuff, like the following:
- Static middleware, router and service names (see above, although this is a side effect of everything needing to be configured explicitly instead of general defaults being applied / inherited)
- More flexibility by being able to configure the above for each container individually, especially HTTPS redirection and TLS configuration
- More flexibility by allowing for more complex rules on routers via new syntax (while breaking the existing host rule syntax)
- Serving and routing TCP traffic including TLS
This means I think I totally understand why the default EntryPoints, HTTPS redirection, ACME and TLS configuration were moved from the general level to the router level. The only thing I am really missing here is: why not have the best of both worlds? Being able to specify sensible defaults for EntryPoints, HTTPS redirection, ACME and TLS configuration on the general level, while still being able to override the defaults on a per router basis?
You can already imagine the potential configuration overhead (duplication instead of DRY) this change has for me by comparing the labels on the traefik container in the two stack definitions above.
Comparison of v1 and v2 container labels
Example of serving a buildbot instance which is providing different ports for different access levels:
Before (with v1.7):
    labels:
      traefik.enable: "true"
      traefik.user.backend: "buildbot-user"
      traefik.user.frontend.rule: "Host:buildbot.example.com;Method:GET,HEAD"
      traefik.user.port: "8080"
      traefik.github.backend: "buildbot-github"
      traefik.github.frontend.rule: "Host:buildbot.example.com;Method:POST;Path:/change_hook/github"
      traefik.github.port: "8080"
      traefik.admin.backend: "buildbot-admin"
      traefik.admin.frontend.rule: "Host:buildbot.example.com;PathPrefixStrip:/admin/"
      traefik.admin.frontend.auth.basic: "..."
      traefik.admin.port: "8443"
      traefik.docker.network: "traefik-{{ inventory_hostname_short }}_dmz"
After (with v2.0):
    labels:
      traefik.enable: "true"
      traefik.docker.network: "traefik-{{ inventory_hostname_short }}_dmz"
      traefik.http.middlewares.buildbot-admin.stripprefix.prefixes: "/admin"
      traefik.http.routers.buildbot-user-https.rule: &rule "Host(`buildbot.example.com`) && (Method(`GET`,`HEAD`) || (Method(`POST`) && Path(`/change_hook/github`)))"
      traefik.http.routers.buildbot-user-https.entrypoints: "https"
      traefik.http.routers.buildbot-user-https.service: &service "buildbot-user@docker"
      traefik.http.routers.buildbot-user-https.tls: "true"
      traefik.http.routers.buildbot-user-https.tls.certResolver: "letsencrypt"
      traefik.http.routers.buildbot-user-http.rule: *rule
      traefik.http.routers.buildbot-user-http.entrypoints: "http"
      traefik.http.routers.buildbot-user-http.middlewares: "https@file"
      traefik.http.routers.buildbot-user-http.service: *service
      traefik.http.routers.buildbot-admin-https.rule: &rule "Host(`buildbot.example.com`) && PathPrefix(`/admin/`)"
      traefik.http.routers.buildbot-admin-https.priority: 200
      traefik.http.routers.buildbot-admin-https.entrypoints: "https"
      traefik.http.routers.buildbot-admin-https.middlewares: "auth@file,buildbot-admin@docker"
      traefik.http.routers.buildbot-admin-https.service: &service "buildbot-admin@docker"
      traefik.http.routers.buildbot-admin-https.tls: "true"
      traefik.http.routers.buildbot-admin-https.tls.certResolver: "letsencrypt"
      traefik.http.routers.buildbot-admin-http.rule: *rule
      traefik.http.routers.buildbot-admin-http.entrypoints: "http"
      traefik.http.routers.buildbot-admin-http.middlewares: "https@file"
      traefik.http.routers.buildbot-admin-http.service: *service
      traefik.http.services.buildbot-user.loadbalancer.server.port: "8080"
      traefik.http.services.buildbot-admin.loadbalancer.server.port: "8443"
As you can see, I am already making use of YAML anchors and aliases for DRY (thanks to @sudo_bmitch on Twitter), but the number of labels still went up from 12 to 25. And this is after I already merged the "github" backend into the "user" service by using a more complex router rule. I am also missing the simplicity and efficiency of being able to match something and modify it at the same time, e.g. with the combined rule matcher and modifier PathPrefixStrip.
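Since the stack definitions are already rendered through Ansible's Jinja2 templating, the repetitive router pairs could in principle be generated with a loop. The following is only a rough sketch: the variable `buildbot_routers` and its structure are made up for illustration, and whitespace handling may need adjusting depending on how the template is rendered.

```yaml
# --- Hypothetical Ansible variable: one entry per logical route ---
buildbot_routers:
  - name: buildbot-user
    service: buildbot-user@docker
    rule: "Host(`buildbot.example.com`) && (Method(`GET`,`HEAD`) || (Method(`POST`) && Path(`/change_hook/github`)))"
  - name: buildbot-admin
    service: buildbot-admin@docker
    rule: "Host(`buildbot.example.com`) && PathPrefix(`/admin/`)"
    priority: 200
    middlewares: "auth@file,buildbot-admin@docker"

# --- Labels section of the stack definition (Jinja2 template, not plain YAML) ---
labels:
  traefik.enable: "true"
  traefik.docker.network: "traefik-{{ inventory_hostname_short }}_dmz"
  traefik.http.middlewares.buildbot-admin.stripprefix.prefixes: "/admin"
{% for r in buildbot_routers %}
  # HTTPS router: serves the application with a Let's Encrypt certificate.
  traefik.http.routers.{{ r.name }}-https.rule: "{{ r.rule }}"
  traefik.http.routers.{{ r.name }}-https.entrypoints: "https"
  traefik.http.routers.{{ r.name }}-https.service: "{{ r.service }}"
  traefik.http.routers.{{ r.name }}-https.tls: "true"
  traefik.http.routers.{{ r.name }}-https.tls.certResolver: "letsencrypt"
{% if r.priority is defined %}
  traefik.http.routers.{{ r.name }}-https.priority: "{{ r.priority }}"
{% endif %}
{% if r.middlewares is defined %}
  traefik.http.routers.{{ r.name }}-https.middlewares: "{{ r.middlewares }}"
{% endif %}
  # HTTP router: same rule, only redirects to HTTPS.
  traefik.http.routers.{{ r.name }}-http.rule: "{{ r.rule }}"
  traefik.http.routers.{{ r.name }}-http.entrypoints: "http"
  traefik.http.routers.{{ r.name }}-http.middlewares: "https@file"
  traefik.http.routers.{{ r.name }}-http.service: "{{ r.service }}"
{% endfor %}
  traefik.http.services.buildbot-user.loadbalancer.server.port: "8080"
  traefik.http.services.buildbot-admin.loadbalancer.server.port: "8443"
```

This only shortens the source I have to maintain. The rendered stack still carries the same number of labels, so it does not replace proper defaults in traefik itself.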
My conclusion
I would really like to ask and urge the developers of traefik v2 to make it possible again to specify some defaults like HTTPS, ACME and TLS for routers on a general level, to be inherited unless overridden. Having 2 routers for every service just because you want TLS feels really redundant, especially if you have to specify the certResolver over and over while only one is configured (e.g. Let's Encrypt). Having to specify a middleware on every HTTP router to redirect to HTTPS feels just as redundant.
I am looking forward to the discussion and feedback around the aspects I brought up. I hope that I have made my points clear. In case I missed something or did something wrong and you have an idea on how to fix it or make it even less verbose and apply DRY, please feel free to comment here. Thanks in advance!
Best regards,
Marc
P.S.: The stupid rule prohibiting new users from posting more than 2 links in a post just made me remove all the references I collected for this. This means it's @traefik's fault that this post has no references attached to it.