How to set up health checks for Traefik?

Ok guys, I am evaluating Ingresses for Kubernetes. I just dropped in NGINX and it works off the bat, health checks included, on GCP. Any issues I had, the folks on the Kubernetes Slack were there to help right away. Now for Traefik...

The docs really don't show how to set up a health check at a particular URL like /health or /healthz, or is it ping? It's not well documented at all. Next, as far as community goes, my question has been sitting without a response for more than 24 hours, which is really slow.

Bottom line is document it or respond in community forum or you're going to lose customers. It's as simple as that. To use something like this in production, the docs have got to be there, or a community that will answer questions of this type. As an architect who works with Fortune 500 companies I can tell you that I will look at it again but either the docs or the support has got to improve. My advice is to assist people with their initial setups and make this as easy a process as possible, especially on the most popular Kubernetes platforms on earth. No reason not to have an install guide for GCP or AWS that is current. No reason not to have a simple drop-in service to allow health checks on install.

Same question for Swarm :)


Hello @hanoisteve, there is a lot going on in this post, so I will address the points individually.

I am evaluating Ingresses for Kubernetes

That is great to hear! I assume that you want to evaluate ingress controllers for use in a production environment, and therefore have some critical systems that you are fronting.

I just dropped in NGINX and it works off the bat, health checks included, on GCP. Any issues I had, the folks on the Kubernetes Slack were there to help right away

That is also great to hear, as NGINX is the default controller for Kubernetes, and has been for quite a while.

The docs really don't show how to set up a health check at a particular URL like /health or /healthz, or is it ping? It's not well documented at all

If you want HTTP health checks, they are done via the ping provider
(https://docs.traefik.io/v1.7/configuration/ping/)
Health checks are also available via a command of the Traefik executable, for systems that don’t have curl-based health checks
(https://docs.traefik.io/v1.7/basics/#command-healthcheck)
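
For reference, a minimal sketch of what the ping setup linked above might look like in a v1.7 traefik.toml. It assumes the API is enabled so that the internal "traefik" entry point exists (port 8080 unless configured otherwise); the entry point name and port are assumptions to adjust for your deployment, not a definitive setup.

```toml
# traefik.toml (v1.7), illustrative sketch only

# Assumption: the API is enabled, and the internal "traefik"
# entry point listens on its default port (8080).
[api]

# Expose the ping endpoint at /ping on that entry point.
[ping]
  entryPoint = "traefik"
```

With this in place, a GET to /ping on that entry point should answer with a 200 while Traefik is up, which is the URL an HTTP liveness or readiness probe would target. As far as I know, the healthcheck command also relies on ping being enabled.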

Next, as far as community goes, my question has been sitting without a response for more than 24 hours, which is really slow.

You have posted multiple questions over the last few days, and multiple members of our team have taken the time to respond and provide code examples and advice. This is also over a weekend, and we have multiple teams in different time zones.

Bottom line is document it or respond in community forum or you're going to lose customers. It's as simple as that.

I don’t want to be contrarian, but our documentation is some of the best in the industry. Being an open source project, most of the documentation has been contributed by the members of the community, and our project team curates for accuracy. If you think the documentation is lacking, feel free to open a PR to improve it!

As for losing customers, our customers all have support contracts for commercial grade support. If you are doing an evaluation for a large deployment, then I would advise you to reach out to our sales team at (https://containo.us/services/#commercial-support), and we can work with you to get your deployment running smoothly.

As an architect who works with Fortune 500 companies I can tell you that I will look at it again but either the docs or the support has got to improve

Again, if you are seriously evaluating Traefik CE/EE for a commercial deployment, you should know the value of commercial support, and should reach out to us directly, instead of posting on the community forums and complaining about turnaround time.

The bottom line is that we are unable to provide on-demand support for everyone that downloads our OSS software. We have over 500 MILLION downloads on Docker Hub alone, and can’t possibly answer every question every user has.

We take our customers’ needs very seriously, and we strive to provide them the best support possible. Unfortunately, that means answering our paying customers’ questions first, and then the community’s questions when we are able to make time.

I will leave you with this question:

Being that Traefik CE is Open-Source, and OSS is a give and take ecosystem, what have you contributed to the project that should afford you priority responses to the questions you may have?

We appreciate that you may have encountered some configuration issues during your time evaluating Traefik, but coming to our community forum and bad-mouthing our team and our company is not going to help your case at all.


Well, I have seen the install documents you listed. Let me explain why this documentation is not clear to a newcomer like myself.

[backends]
  [backends.backend1]
    [backends.backend1.healthcheck]
      path = "/health"
      interval = "10s"
      port = 8080

In a sense this is exactly what I need, and I thought I was done at this point.
But then I was thinking: what is at backends.backend1.healthcheck?
Is there a service there? Do I need to add one? Or will ping somehow respond because of this configuration?

I only know that ping CAN be enabled. But does this map /health to ping? Or do I need to create a dummy service backend that responds with 200?
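
For readers with the same question, a hedged sketch of how this block is usually completed. As the v1.7 docs describe it, the healthcheck section does not create an endpoint and is separate from ping: it tells Traefik to send periodic GET requests to each server of the backend at the given path, and to take a server out of rotation while it answers with anything other than 2xx or 3xx. The server name, URL and weight below are hypothetical, added only for illustration.

```toml
[backends]
  [backends.backend1]
    # Traefik itself sends GET /health to every server of backend1,
    # every 10s, using port 8080 instead of the port in the server URL.
    [backends.backend1.healthcheck]
      path = "/health"
      interval = "10s"
      port = 8080

    # Hypothetical server entry: the application behind this URL must
    # answer /health on port 8080 itself; Traefik does not do it for it.
    [backends.backend1.servers.server1]
      url = "http://10.0.0.10:80"
      weight = 1
```

Under that reading, declaring a healthcheck under another backend name (backend3, say) would only probe that backend's own servers, and /ping would still be what reports on Traefik itself.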

Secondly, on initial install many of us use command-line arguments to start up the server, so translating this into arguments is a second issue, if that option even exists. The documentation did not work for me because I fail to see how declaring this backend alone would cause health checks to go through. What if I do backend3.healthcheck, does that enable health checking too? These may sound like dumb questions, and I think if I took time out to study the backend and frontend configuration logic I could have figured it out eventually. But on install I just want the thing up and running and the admin panel visible.

I don't view this as bad-mouthing. I am just stating the facts. As far as who should get priority, again my recommendation is that getting people initially set up is a priority. Then they will have more advanced questions and time to research how the product works in detail, but if it's not running and the customer cannot get to the admin panel, that's a serious issue for the adoption of Traefik. That's all I am saying. The product may well be exactly what we need, but these initial configuration steps are just too time consuming without clearer documentation of the steps for installing on the major cloud platforms.

I would suggest:

  1. Installation guide for a bare metal cluster
  2. Installation guide for minikube
  3. Installation guide for GCP
  4. Installation guide for AWS

I am not the guy to write the 3rd one for obvious reasons. I did get it running on a bare metal cluster.

PS: There are some online guides and it can be done, but they are not up to date, or they involve many other steps like creating volumes, enabling TLS, etc., which differ quite a bit from customer to customer.