We run several docker-swarm clusters, each with a configuration roughly as follows.
On every cluster we deploy a traefik container configured like this:
services:
  proxy:
    image: ${DOCKER_REGISTRY_URL}/traefik:v1.7.11-alpine
    deploy:
      mode: global
    networks:
      - backend
    ports:
      - target: 80
        published: 80
        protocol: tcp
        mode: host
      - target: 8080
        published: 8080
        protocol: tcp
        mode: host
      - target: 8091
        published: 8091
        protocol: tcp
        mode: host
      - target: 8079
        published: 8079
        protocol: tcp
        mode: host
    command:
      - --configFile=/etc/traefik/traefik.toml
    configs:
      - source: toml
        target: /etc/traefik/traefik.toml
    secrets:
      ...
configs:
  toml:
    name: toml-${CONFIG_SHA}
    file: ./traefik.toml
    labels:
      com.docker.ucp.access.label: /Shared
On these clusters we deploy several services that look something like this:
services:
  myApp:
    image: ${DOCKER_REGISTRY_URL}/myApplication:${TAG}
    deploy:
      mode: global
      restart_policy:
        condition: on-failure
      update_config:
        parallelism: 1
        delay: 10s
      labels:
        traefik.enable: "true"
        traefik.docker.network: "backend"
        traefik.alias.frontend.rule: "Host:firstdns.alias,seconddns.alias"
        traefik.extern.frontend.rule: "Host:proxy.alias; PathPrefix:/myApp"
        traefik.port: "8080"
        traefik.frontend.passHostHeader: "true"
    labels:
      com.docker.ucp.access.label: /Shared
360+ days out of the year, this runs perfectly fine. Then, all of a sudden, traefik forgets the rules for one of these services (this happens maybe once a month, spread across 12 of these clusters). Calls matching these rules are no longer routed and result in a 404, and the traefik dashboard no longer shows the rule. Basically, traefik behaves as though the labels were missing from the service, even though the containers still show these labels when inspected directly. Meanwhile, the traefik rules for the other services on the cluster are still followed as usual.
While this does seem to happen more often when the service is restarted or updated, we have seen it happen completely out of the blue, without any container having been (re)started.
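To catch the moment a rule disappears, the symptom above (labels still present, rule gone from traefik) could be checked automatically. The following is only a minimal sketch under several assumptions: the helper names are made up for illustration, the labels are passed in as a plain dict (for example the JSON printed by `docker service inspect --format '{{json .Spec.Labels}}' <service>`), the traefik API is reachable on the published port 8080 and returns the provider configuration with a `frontends` → `routes` → `rule` layout at `/api/providers/docker`, and traefik stores each rule verbatim as written in the label.

```python
import json
from urllib.request import urlopen


def expected_rules(labels):
    """Collect the frontend rules declared via traefik.*frontend.rule labels."""
    return {
        value
        for key, value in labels.items()
        if key.startswith("traefik.") and key.endswith("frontend.rule")
    }


def active_rules(provider_config):
    """Collect the route rules traefik currently holds for one provider.

    Assumes the JSON layout {"frontends": {name: {"routes": {name: {"rule": ...}}}}}.
    """
    rules = set()
    for frontend in (provider_config.get("frontends") or {}).values():
        for route in (frontend.get("routes") or {}).values():
            rules.add(route.get("rule"))
    return rules


def missing_rules(labels, provider_config):
    """Return the rules that are declared in labels but unknown to traefik."""
    return sorted(expected_rules(labels) - active_rules(provider_config))


def fetch_provider_config(base_url="http://localhost:8080", provider="docker"):
    """Fetch the provider configuration from the traefik API (assumed endpoint)."""
    with urlopen(f"{base_url}/api/providers/{provider}", timeout=5) as response:
        return json.load(response)
```

Run periodically on a node, `missing_rules(labels, fetch_provider_config())` returning a non-empty list would indicate that traefik has dropped a rule, and the timestamp could then be correlated with the traefik and docker daemon logs.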
Is anyone familiar with similar issues? Could this be the result of a configuration error, or is it (probably) a bug?