Traefik v2 no access to service

I have a Docker Swarm cluster running with one manager and two worker nodes, and I have run into a very strange situation. Here is my Traefik config:

version: "3.7"

networks:
  cluster_network:
    driver: overlay
    ipam:
      driver: default
      config:
        - subnet: 192.168.99.0/24
services:
  traefik:
    image: traefik:v2.3.6
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [ node.role == manager ]
      labels:
        - traefik.enable=true
        - traefik.http.routers.traefik_http.rule=Host(`dev.xxx.com`)
        - traefik.http.routers.traefik_http.service=api@internal
        - traefik.http.routers.traefik_http.tls.certresolver=letsencryptresolver
        - traefik.http.routers.traefik_http.tls=true
        - traefik.http.routers.traefik_http.entrypoints=webs
        - traefik.http.routers.traefik_http.middlewares=auth
        - traefik.http.services.admin.loadbalancer.server.port=8080
        - traefik.http.middlewares.auth.basicauth.removeheader=true
        - traefik.http.middlewares.auth.basicauth.users=admin:xxx
        - traefik.docker.network=ks_cluster_network
    command: >
      --providers.docker
      --providers.docker.endpoint=unix:///var/run/docker.sock
      --providers.docker.exposedbydefault=false
      --providers.docker.swarmmode=true
      --providers.docker.network=ks_cluster_network
      --entryPoints.web.address=:80
      --entryPoints.webs.address=:443
      --accesslog
      --log.level=DEBUG
      --api=true
      --tracing=false
      --api.dashboard=true
      --api.insecure=true
      --tracing.serviceName=admin
      --serverstransport.insecureskipverify=true
      --certificatesresolvers.letsencryptresolver.acme.httpchallenge=true
      --certificatesresolvers.letsencryptresolver.acme.httpchallenge.entryPoint=web
      --certificatesresolvers.letsencryptresolver.acme.email=admin@xxx.com
      --certificatesresolvers.letsencryptresolver.acme.storage=/letsencrypt/acme.json
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - acme:/letsencrypt
    networks:
      - cluster_network   # attach Traefik to the overlay network it routes on

volumes:
  acme:                   # named volume that stores acme.json

Then I run traefik as a separate stack with docker stack deploy -c ./traefik.yml ks.

Then I create another stack for services.


networks:
  ks_cluster_network:
    external: true

services:
  whoami:
    image: containous/whoami:latest
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [ node.hostname == node1 ]
      labels:
        - traefik.enable=true
        - traefik.backend=whoami
        - traefik.http.routers.whoami.rule=Host(`whoami.xxx.com`)
        - traefik.http.routers.whoami.entrypoints=webs
        - traefik.http.routers.whoami.tls.certresolver=letsencryptresolver
        - traefik.http.routers.whoami.tls=true
        - traefik.docker.network=ks_cluster_network
        - traefik.http.services.whoami.loadbalancer.server.port=80
    networks:
      - ks_cluster_network
  erp:
    image: monogramm/docker-dolibarr
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [ node.hostname == node1 ]
      labels:
        - traefik.enable=true
        - traefik.backend=erp
        - traefik.http.routers.erp.rule=Host(`erp.xxx.com`)
        - traefik.http.routers.erp.entrypoints=webs
        - traefik.http.routers.erp.tls.certresolver=letsencryptresolver
        - traefik.http.routers.erp.tls=true
        - traefik.docker.network=ks_cluster_network
        - traefik.http.services.erp.loadbalancer.server.port=80
    networks:
      - ks_cluster_network

The whoami service works perfectly at "https://whoami.xxx.com", but erp at "https://erp.xxx.com" doesn't work.
Using curl from the Traefik container, I get the following (192.168.99.13 is the IP of the erp container):

docker exec -it 95bb curl -v http://192.168.99.13:80                                                                        
*   Trying 192.168.99.13:80...
* TCP_NODELAY set
* Connected to 192.168.99.13 (192.168.99.13) port 80 (#0)
> GET / HTTP/1.1
> Host: 192.168.99.13
> User-Agent: curl/7.67.0
> Accept: */*
> 
after a while
> 
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer

But the network is reachable:

docker exec -it 95bb ping 192.168.99.13                                                                       
PING 192.168.99.13:80 (192.168.99.13): 56 data bytes
64 bytes from 192.168.99.13: seq=0 ttl=64 time=70.196 ms
64 bytes from 192.168.99.13: seq=1 ttl=64 time=31.833 ms
64 bytes from 192.168.99.13: seq=2 ttl=64 time=30.375 ms
64 bytes from 192.168.99.13: seq=3 ttl=64 time=69.206 ms

The only Traefik access log entry related to erp is a client-closed request (499), written after I abort the request:

"GET / HTTP/2.0" 499 21 "-" "-" 1229 "erp@docker" "http://192.168.99.13:80" 36890ms

But if I run erp on node2, it works perfectly fine. In summary:

  1. The Traefik dashboard shows everything is fine.
  2. If erp is deployed on node1, it hangs with no error message and can't be accessed.
  3. If erp is deployed on node2, it works fine.

I cleaned out all iptables rules and restarted Docker, and I also recreated the Docker Swarm cluster. Still the same situation.

What could be wrong?

Thanks,

Hi,

  1. traefik dashboard shows everything fine

Are you missing the 8080:8080 port mapping for the dashboard?
I would attach both networks to the traefik service. I like to split them: one network for Traefik itself and another for the services.
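
A minimal sketch of that suggestion, based on the stack file from the question. The second network name, traefik_net, is hypothetical and not part of the original setup; publishing 8080 only matters if the dashboard should be reachable directly on the port that --api.insecure=true opens.

```yaml
# Sketch only: fragment of the Traefik stack file, other keys unchanged.
networks:
  traefik_net:          # hypothetical extra network for Traefik itself
    driver: overlay
  cluster_network:      # existing network shared with the routed services
    driver: overlay

services:
  traefik:
    image: traefik:v2.3.6
    ports:
      - "80:80"
      - "443:443"
      - "8080:8080"     # dashboard/API port used with --api.insecure=true
    networks:
      - traefik_net
      - cluster_network
```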

2. traefik.backend=erp

Don't mix Traefik 2.x labels with 1.7 labels! (Remove the backend label.)
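
For comparison, a rough sketch of how the two label schemes differ for the erp service. The 1.7 lines other than traefik.backend are not from the question; they are shown only as an illustration of the older style and should be treated as an assumption about what a 1.7 setup would have used.

```yaml
# Traefik 1.7 style labels (no meaning in the 2.x label scheme):
#   - traefik.backend=erp
#   - traefik.frontend.rule=Host:erp.xxx.com
#   - traefik.port=80
#
# Traefik 2.x equivalents, as already used in the stack file:
labels:
  - traefik.enable=true
  - traefik.http.routers.erp.rule=Host(`erp.xxx.com`)
  - traefik.http.services.erp.loadbalancer.server.port=80
```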

  1. if erp is deploy in node1 then erp stuck no error message, but can't access

This looks like a problem with the labels of the services on the same node (whoami + erp). Both use port 80 and you are not assigning a service to each router. Try this:

  • service whoami --> - "traefik.http.routers.whoami.service=whoami"
  • service erp --> - "traefik.http.routers.erp.service=erp"

  1. deploy in node2 then works fine.

No conflict, because the services are on different worker nodes:

  • node 1 --> whoami
  • node 2 --> erp
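
Applied to the whoami service from the question, the suggestion would look roughly like the sketch below. It only rearranges labels that already appear in the thread, drops traefik.backend, and adds the explicit router-to-service binding; whether that changes the behaviour on node1 is untested here.

```yaml
# Sketch only: whoami service with the v1 label removed and the
# router bound explicitly to its service.
services:
  whoami:
    image: containous/whoami:latest
    deploy:
      labels:
        - traefik.enable=true
        - traefik.http.routers.whoami.rule=Host(`whoami.xxx.com`)
        - traefik.http.routers.whoami.entrypoints=webs
        - traefik.http.routers.whoami.tls=true
        - traefik.http.routers.whoami.tls.certresolver=letsencryptresolver
        - traefik.http.routers.whoami.service=whoami
        - traefik.http.services.whoami.loadbalancer.server.port=80
        - traefik.docker.network=ks_cluster_network
    networks:
      - ks_cluster_network
```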

Best
Diego

Hi Diego,

Thanks for your reply. I changed the labels accordingly, as follows:

labels:
        - "traefik.enable=true"

        - traefik.http.routers.erp-http.rule=Host(`erp.xxx.com`)
        - traefik.http.routers.erp-http.entrypoints=http
        - traefik.http.middlewares.erp-http-redirect.redirectscheme.scheme=https
        - traefik.http.routers.erp-http.middlewares=erp-http-redirect

        - traefik.http.routers.erp.rule=Host(`erp.xxx.com`)
        - traefik.http.routers.erp.entrypoints=webs
        - "traefik.http.routers.erp.tls.certresolver=letsencryptresolver"
        - "traefik.http.routers.erp.tls=true"
        - "traefik.http.services.erp.loadbalancer.server.port=80"
        - "traefik.http.routers.erp.service=erp"
        - "traefik.docker.network=ks_cluster_network"

But it is still the same. I used nsenter on the swarm manager and on node1, which is running the erp service.


On node1, which is running the erp Docker service, I can capture traffic via tshark.

The Traefik service has IP 192.168.99.16 and the erp service has IP 192.168.99.108.

From the captured packets, it looks like the first connection is fine, and after that I get "destination unreachable".

Any idea?

Thanks,

I don't see anything wrong in your Traefik configuration now.

Could you describe your swarm cluster and how your VMs are hosted (VirtualBox, cloud server, etc.)?

Also, could you attach the traefik service logs and the erp service logs, please?

docker service logs ks_traefik
docker service logs ks_erp