Healthcheck through Traefik

Sarke · March 10, 2021, 8:54pm

How do I set up a healthcheck that connects to the same node and runs through Traefik?

My use case is that I had a healthcheck set up for a service, and it was passing because inside the service the webserver was reachable.

However, when one particular node was serving the request it would return 502 Bad Gateway. Restarting the service solved the problem.

Now if I just do a simple http check with the domain name included, I would get any one of the load-balanced nodes randomly, right? How can I hit the node that this particular container is running on?

jakubhajek · March 11, 2021, 9:40am

Hi @Sarke
For a healthcheck, you can use this test command (wget is included in the official Traefik image):

https://github.com/jakubhajek/traefik-proxy/blob/fd0deb41f79068657472c1f68a4e1d75cd809c34/basic-docker-swarm/traefik-stack.yaml#L9


# docker stack deploy traefik-stack -c traefik-stack.yaml --prune

version: "3.8"

services:
  traefik:
    image: "traefik:v2.4.5"
    healthcheck:
      test: wget --quiet --tries=1 --spider http://ping.127.0.0.1.nip.io/ping || exit 1
      interval: 10s
      timeout: 1s
      retries: 3
      start_period: 10s
    command:
      - "--global.sendAnonymousUsage=true"
      - "--api.insecure=true"
      - "--api.dashboard=true"
      - "--providers.docker=true"
      - "--providers.docker.swarmmode=true"

Hope that helps

Sarke · March 12, 2021, 2:09am

Hi @jakubhajek, that is not what I am looking for.

I already have a healthcheck for Traefik, and I have a healthcheck for my service. Both healthchecks pass.

The problem is that I can still get a 502 Bad Gateway when actually trying to access the site.

I guess it has something to do with that container not being routed properly by either Traefik or Swarm. I don't know why this happens sometimes, but restarting the container solves it.

So, I need to be able to check, from the container of the service on each node, if that specific container is reachable on that node.

jakubhajek · March 12, 2021, 7:49am

Hey @Sarke

What is your current configuration?

Traefik also allows you to set a health check at the service level:

https://doc.traefik.io/traefik/routing/services/#health-check

In my repositories, you can also find examples of how to implement that using Swarm.

Thanks,

Sarke · March 18, 2021, 8:01am

Hey @jakubhajek

Thanks again for the reply. The service health checks are a start, but they're not quite what I'm looking for. They will remove the node from load balancing, but they will not restart the container. I am looking for a way to use the Swarm service health checks, as Swarm will restart containers that are not working. Taking them out of the rotation is just a temporary measure; it doesn't heal the service.

So, to rephrase my question: if I were one of those containers, how could I wget an external URL that is sure to route back to this same node?

jakubhajek · March 18, 2021, 9:12am

Hey @Sarke

It seems that your question is specifically about Docker Swarm.

If you have a container with a custom app, you need to prepare the healthcheck accordingly. For example, if the app opens port 8080, you can still use wget to check whether the app inside the container responds with status 200.
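A minimal sketch of such a container-level check in a Swarm stack file. It assumes the app listens on port 8080, exposes a /healthcheck path, and ships wget in its image; the service name myapp, the image name, and the path are placeholders, not taken from this thread. Once the check has failed `retries` times in a row, Swarm treats the task as unhealthy and replaces it. Note that the probe reaches the app from inside its own container only, so it does not exercise the route through Traefik.

```yaml
version: "3.8"

services:
  myapp:
    # Hypothetical image; it must contain wget for the test below to work.
    image: "example/myapp:latest"
    healthcheck:
      # Probe the app on its own port from inside the container.
      # Port 8080 and /healthcheck are assumptions; adjust to your app.
      test: ["CMD-SHELL", "wget --quiet --tries=1 --spider http://127.0.0.1:8080/healthcheck || exit 1"]
      interval: 10s
      timeout: 2s
      retries: 3
      start_period: 10s
    deploy:
      restart_policy:
        # "any" is the default; shown here only to make the behaviour explicit.
        condition: any
        delay: 10s
```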

You can refer to my example configuration for the Traefik healthcheck. Traefik has a built-in endpoint (/ping, used in that example) for checking the condition of Traefik itself.

Traefik also has a feature to check the condition of a service and remove unhealthy containers from the load balancer: https://doc.traefik.io/traefik/routing/services/#health-check
Again, it relies on a specific endpoint exposed by your app (e.g. /healthz).

Here is an example of a similar implementation:

https://github.com/jakubhajek/traefik-swarm-mastery/blob/0d6bb1be42773c5d640b97b34bc2fbae70b54e02/stack-app.yml#L62


        condition: on-failure
        delay: 10s
        max_attempts: 3
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.myapp.rule=Host(`node-app.labs.cometari.eu`)"
        - "traefik.http.routers.myapp.tls.certresolver=le"
        - "traefik.http.routers.myapp.entrypoints=websecure,web"
        - "traefik.http.services.myapp.loadbalancer.server.port=80"
        - "traefik.http.services.myapp.loadbalancer.passhostheader=true"
        - "traefik.http.services.myapp.loadbalancer.healthcheck.path=/healthcheck"
        - "traefik.http.services.myapp.loadbalancer.healthcheck.interval=100ms"
        - "traefik.http.services.myapp.loadbalancer.healthcheck.timeout=75ms"
        - "traefik.http.services.myapp.loadbalancer.healthcheck.scheme=http"
      resources:
        limits:
          memory: 128MB

configs:
  nginx_config:
    name: nginx-config-${NGINX_CONFIG:-1}

I hope that helps.

Sarke · March 18, 2021, 9:43am

Thanks, but I don't think you understand what I am saying. I don't know how to explain it further, but I appreciate your help.

I think I can hack something together by using the sticky session cookie.

stephaneeybert · June 26, 2021, 7:54am

I too have a Bad Gateway error. It's my first time playing with Traefik or with a load balancer in fact.

I posted a question at https://stackoverflow.com/q/68054228/958373

I simply wonder how to have Traefik point to a newly restarted service after a failed Docker Swarm healthcheck?

jakubhajek · June 28, 2021, 2:27pm

Hey @stephaneeybert

Thanks for using Traefik.

Referring to your question on Stack Overflow, "I would like Traefik to restart unhealthy services": Traefik will not restart unhealthy services. It is the responsibility of the cluster orchestration tool to make sure that your services are up and running and ready to accept incoming requests.

On Kubernetes, you can have two types of health checks: readiness and liveness.

A readiness probe lets Kubernetes know when your application is ready to accept incoming traffic. Kubernetes makes sure that the readiness probe passes before sending requests through a Service to a pod. If the readiness probe fails, Kubernetes stops sending traffic to the pod until the probe passes again.
A liveness probe lets Kubernetes know whether your application is alive or dead. If the application is not alive, Kubernetes restarts the container.
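A minimal sketch of the two probes in a Pod manifest. The Pod name, image, port 8080, and /healthcheck path are placeholders chosen for illustration, and the timing values are arbitrary.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp                      # hypothetical name
spec:
  containers:
    - name: myapp
      image: example/myapp:latest  # hypothetical image
      ports:
        - containerPort: 8080
      # Readiness: while this fails, the pod receives no traffic
      # through its Service.
      readinessProbe:
        httpGet:
          path: /healthcheck       # assumed endpoint
          port: 8080
        periodSeconds: 5
        failureThreshold: 3
      # Liveness: repeated failures cause the container to be restarted.
      livenessProbe:
        httpGet:
          path: /healthcheck       # assumed endpoint
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 10
        failureThreshold: 3
```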

Traefik sends requests to the endpoints exposed by the Kubernetes Service.

With Docker Swarm, you can also define a healthcheck at the container level and use a health check at the service level, as described in the documentation. You can also use order: start-first, as we also described in that thread.
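As a rough illustration, the two kinds of check and `order: start-first` could be combined in one stack file as below. The service name, image, port 8080, and /healthcheck path are assumptions. The Traefik label names follow the example earlier in this thread. With `start-first`, Swarm starts the replacement task before stopping the old one during an update.

```yaml
version: "3.8"

services:
  myapp:
    image: "example/myapp:latest"   # hypothetical image that contains wget
    # Container-level check, evaluated by Docker / Swarm.
    healthcheck:
      test: ["CMD-SHELL", "wget --quiet --tries=1 --spider http://127.0.0.1:8080/healthcheck || exit 1"]
      interval: 10s
      timeout: 2s
      retries: 3
    deploy:
      replicas: 2
      update_config:
        # Start the new task first, then stop the old one.
        order: start-first
        failure_action: rollback
      labels:
        - "traefik.enable=true"
        - "traefik.http.services.myapp.loadbalancer.server.port=8080"
        # Service-level check, evaluated by Traefik; it only takes
        # unhealthy servers out of the load balancer.
        - "traefik.http.services.myapp.loadbalancer.healthcheck.path=/healthcheck"
        - "traefik.http.services.myapp.loadbalancer.healthcheck.interval=10s"
        - "traefik.http.services.myapp.loadbalancer.healthcheck.timeout=2s"
```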

Thank you,

stephaneeybert · June 29, 2021, 8:07pm

Thank you Jakub for that detailed response. I shall investigate further then how to solve my bad response error by looking into these different healthcheck possibilities.
