Unable to connect Traefik to container in adjacent docker swarm

Because there are a number of files required to reproduce this issue, I put together a simple example that, on my Mac, reproduces the problem I will describe here. The repo is here:

As far as I am aware, the routes I tagged were all created identically. The two routes in the Swarm stack that the Traefik container resides in work fine; the route in the adjacent portainer stack fails.

Configuration for traefik stack

version: "3.3"

services:

  traefik:
    image: traefik:2.0.1
    command:
      - --log.level=DEBUG
      - --api.insecure=true
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --entrypoints.web.address=:80
      - --providers.file.directory=/traefik-config
      - --providers.file.watch=true
      - --api.debug=true
      - --log.filepath=/traefik.log
      - --accesslog.filepath=/traefik.access.log
      - --log.format=json
    ports:
      - "80:80"
      - "8199:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ${PWD}/volumes/traefik:/traefik
      - ${PWD}/traefik.log:/traefik.log
      - ${PWD}/traefik.access.log:/traefik.access.log
    networks:
        - skylab-overlay      

  influxdb:
      image: influxdb:1.5.2
      restart: always
      volumes:
          - ${PWD}/volumes/influxdb:/var/lib/influxdb
      networks:
          - skylab-overlay
      labels:
          - "traefik.enable=true"
          - "traefik.http.routers.influxdb.rule=Host(`localhost`) && PathPrefix(`/influxdb`)"
          - "traefik.http.routers.influxdb.entrypoints=web"
          - "traefik.http.services.influxdb.loadbalancer.server.port=8086"
          - "traefik.http.middlewares.influxdb-strip-prefix.stripPrefix.prefixes=/influxdb"
          - "traefik.http.routers.influxdb.middlewares=influxdb-strip-prefix@docker"
          
  grafana:
      image: grafana/grafana:5.4.3
      environment:
          - GF_AUTH_ANONYMOUS_ENABLED
          - GF_SERVER_ROOT_URL
          # Grafana - overrides the grafana.ini file
          #   See http://docs.grafana.org/installation/configuration/#using-environment-variables
          #   See also http://docs.grafana.org/installation/docker/#configuration
          - GF_AUTH_ANONYMOUS_ENABLED=${GF_AUTH_ANONYMOUS_ENABLED:-true}
          - GF_SERVER_ROOT_URL=${GF_SERVER_ROOT_URL:-%(protocol)s://%(domain)s:%(http_port)s/grafana/}
      restart: always
      volumes:
          - ${PWD}/volumes/grafana:/var/lib/grafana
      networks:
          - skylab-overlay
      labels:
          - "traefik.enable=true"
          - "traefik.http.routers.grafana.rule=Host(`localhost`) && PathPrefix(`/grafana`)"
          - "traefik.http.routers.grafana.entrypoints=web"
          - "traefik.http.services.grafana.loadbalancer.server.port=3000"
          - "traefik.http.middlewares.grafana-strip-prefix.stripPrefix.prefixes=/grafana"
          - "traefik.http.routers.grafana.middlewares=grafana-strip-prefix@docker"

networks:
   skylab-overlay:
        # Note: external networks are expected to exist prior to bringing this configuration up.
        # Use: docker network create  -d overlay --attachable skylab-overlay
        external: true

Configuration for portainer stack

version: '3.2'
services:
  portainer:
    image: portainer/portainer:1.22.0
    command: -H tcp://tasks.agent:9001 --tlsskipverify 
    ports:
    - "9000:9000"
    - "8000:8000"
    networks:
    - agent_network
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [node.role == manager]
    labels:
        - "traefik.enable=true"
        - "traefik.http.routers.portainer.rule=Host(`localhost`) && PathPrefix(`/portainer`)"
        - "traefik.http.routers.portainer.entrypoints=web"
        - "traefik.http.services.portainer.loadbalancer.server.port=9000"
        - "traefik.http.middlewares.portainer-strip-prefix.stripPrefix.prefixes=/portainer"
        - "traefik.http.routers.portainer.middlewares=portainer-strip-prefix@docker"
        - "traefik.docker.network=agent_network"

  agent:
    image: portainer/agent:1.4.0
    environment:
      # REQUIRED: Should be equal to the service name prefixed by "tasks." when
      # deployed inside an overlay network
      AGENT_CLUSTER_ADDR: tasks.agent
      # AGENT_PORT: 9001
      # LOG_LEVEL: debug
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /var/lib/docker/volumes:/var/lib/docker/volumes
    networks:
      - agent_network
    deploy:
      mode: global
      placement:
        constraints: [node.platform.os == linux]
networks:
  agent_network:
    driver: overlay
    attachable: true

When attempting to connect to http://localhost/portainer, the gateway times out. Connecting to http://localhost/influxdb or http://localhost/grafana works as expected (they are in the same swarm stack as the traefik container).

This is what it looks like from the web UI

Here are the service details.

Hello,

can you explain how you route traffic between the network the traefik is on and the network the portainer is on? I'm not a swarm expert, but from what I understand those are isolated, and by default you could not expect one to be reachable from the other.

In any case, to prove that this is not a traefik problem, you can exec into the traefik container and try to reach portainer from there (e.g. with curl). If you cannot, then it's understandable that traefik cannot either.

That test would be easier to invoke if curl were shipped in the traefik image. It is not. Ping however is, and it was sufficient for your proposed test.

You are correct, the container I am attempting to connect to is not visible from the traefik container.

ping: bad address portainer

I tried attaching portainer to the same overlay network that traefik was on. That allowed ping to see it; however, portainer and the portainer agent have their own networking requirements, and that configuration change broke them.

The use case I was envisioning was having a single routing point that could vector to multiple swarm stacks. Unless I deploy them on a single overlay network, I don't believe that is possible. I will play around with alternate configurations to achieve my own requirements. 
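One arrangement that might satisfy both requirements is to attach portainer to the shared overlay in addition to agent_network, rather than instead of it, so that tasks.agent still resolves. The sketch below is untested against this repo. It uses the real `docker service update --network-add` and `--container-label-add` flags, but the helper names `run` and `share_with_traefik` are made up for illustration, and the service name you pass in (for example `portainer_portainer`) is an assumption that depends on the stack name used at deploy time.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the service name passed in
# depends on how the stack was deployed (e.g. "portainer_portainer").

# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# share_with_traefik NETWORK SERVICE
# Adds NETWORK to SERVICE as an additional network (networks that are already
# attached stay attached) and points the traefik.docker.network label at it.
share_with_traefik() {
  network="$1"
  service="$2"
  run docker service update --network-add "$network" "$service"
  run docker service update \
    --container-label-add "traefik.docker.network=$network" "$service"
}
```

Setting `DRY_RUN=1` and calling `share_with_traefik skylab-overlay portainer_portainer` prints the two commands without running them. Changes made this way are lost on the next stack deploy, so the equivalent edit in the portainer stack file (a second entry under `networks:` that refers to the external skylab-overlay, plus the updated label) is probably the more durable form.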

Thanks for your help and insight.

Usually it's trivial to install curl: it's one command and a few seconds if your container has an internet connection. And even if it does not, you can just host the binary somewhere your container has access to.

Usually people prefer smaller containers over large ones with "potentially useful tools", and of course if one would like to create an image based on traefik and add the tooling they require to it, they can do so.

I'm glad that ping helped you; however, just today I wrote the following post about ping, so your mileage may vary.

Usually it's trivial to install curl: it's one command and a few seconds if your container has an internet connection.

In our case we do not have direct access to an internet connection. This is not unusual in an enterprise setting.

I performed my original testing by spawning a Tomcat container alongside Traefik in the same stack. I then invoked curl from there, making the transitive assumption of network equivalence between the two containers. That was quicker than digging into which Unix distro your container was based upon and figuring out how to install curl without the internet.

I did notice (this morning) that you do ship nc on the container so perhaps that is what you should recommend for these types of connectivity diagnostics. It is a lower level tool than curl but it will test TCP connectivity to the port with a TCP connection (as opposed to ICMP with ping).

docker exec -it dd0a193c1a5e nc -zv influxdb 8086
influxdb (10.0.0.59:8086) open
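The same check can be repeated for several backends at once. This is a minimal sketch, assuming the image's nc accepts `-z` (as the command above shows) and `-w` for a connect timeout. The function name `check_targets` is hypothetical, and it is meant to be pasted into a shell inside the traefik container (for example one opened with `docker exec -it <container> sh`).

```shell
#!/bin/sh
# check_targets HOST:PORT...
# Attempts a TCP connection to each target with nc and reports the result.
# Anything that prevents the connection (DNS failure, refused or filtered
# port, nc missing from the image) is reported as "unreachable".
check_targets() {
  for target in "$@"; do
    host="${target%:*}"    # text before the last colon
    port="${target##*:}"   # text after the last colon
    if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
      echo "$target open"
    else
      echo "$target unreachable"
    fi
  done
}
```

For the stacks in this thread the call would be `check_targets influxdb:8086 grafana:3000 portainer:9000`, using the ports from the loadbalancer labels. Unlike ping, this exercises the TCP port that traefik would actually connect to.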

No, it's not. I personally (a traefik user, same as you, not affiliated with Containous) set up a minio server for blob hosting, where I host everything that the infrastructure might need access to inside the corporate network. Fetching things from there with curl or wget is then a breeze.

Given that the Dockerfile links are published at the very top of the corresponding Docker Hub page, I did not find it too harrowing to click on them and read the first line, which specifies the distribution. Your experience may vary.

I don't have the statistics, but I have a feeling that Alpine is the prevalent container base OS. By the time I started working with traefik I was already familiar with it, because it was used by many of the other containers I was running. Thus I did not find installing curl there difficult or unintuitive. Well, I guess those things come with experience.

Good catch about nc. I'm not sure, but I have a feeling that nc is only available there as part of busybox, not on its own, and Alpine is busybox based. You can type busybox at the command line to see what other commands are available, or read the busybox documentation if you are interested enough.

nc is definitely very good troubleshooting advice for those who cannot quickly apk add -U curl because they are not connected to the internet.

Speaking of wget: following your advice, I ran the busybox command to learn what is available. wget is one of the binaries preinstalled on the container, so it is immediately available for triaging connection issues.

It is! I just prefer curl :wink: