This is the third in a series of articles on site reliability engineering (SRE) and how Traefik can help supply the monitoring and visibility that are necessary to maintain application health.
The first article discussed log analysis using tools from the Elastic stack. The second covered creating visualizations from Traefik metrics with Prometheus and Grafana. This third article explores using another open-source project, Jaeger, to perform request tracing for applications on Kubernetes.
About Tracing
Debugging anomalies, bottlenecks, and performance issues is a challenge in distributed architectures, such as microservices. Each user request typically involves the collaboration of many services to deliver the intended outcome. Because traditional monitoring methods like application logs and metrics tend to target monolithic applications, they can fail to capture the full performance trail for every request.
Distributed tracing, therefore, is an important profiling technique that complements log monitoring and metrics. It captures the transaction flow across various application components and services involved in processing a user request. The captured data can then be visualized to show which component malfunctioned and caused an issue, such as an error or bottleneck.
This post demonstrates how to integrate Traefik with Jaeger, an open-source tracing application that's a project of the Cloud Native Computing Foundation. The integration will capture traces for user requests across the various components of a hypothetical application running on a Kubernetes cluster.
Prerequisites
This post will walk you through the process of integrating Traefik and Jaeger, but you'll need to have a few things set up first:

- A Kubernetes cluster running at `localhost`. The Traefik Labs team often uses k3d for this purpose, which creates a local cluster in Docker containers. However, k3d comes bundled with the latest version of k3s, and k3s comes packaged with Traefik v1.7, which you'll want to disable so you can use the latest version. The following command creates the cluster and exposes it on port 8081:

  ```shell
  k3d cluster create dev -p "8081:80@loadbalancer" --k3s-server-arg --disable=traefik
  ```

- The `kubectl` command-line tool, configured to point to your cluster. (If you created your cluster using k3d and the instructions above, this will already be done for you.)
- A recent version of the Helm package manager for Kubernetes.
- The set of configuration files that accompany this article, which are available on GitHub:

  ```shell
  git clone https://github.com/traefik-tech-blog/traefik-sre-tracing/
  ```

You do not need to have Traefik 2.x preinstalled, as you'll do that along the way.
Set Up Tracing
First, you'll need to install and configure Jaeger on your Kubernetes cluster. The simplest way is to use the official Helm chart. As a first step, add the `jaegertracing` repository to your Helm repo list and update its contents:

```shell
helm repo add jaegertracing https://jaegertracing.github.io/helm-charts
helm repo update
```
The Jaeger repository provides two charts: `jaeger` and `jaeger-operator`. For the purpose of this discussion, you'll deploy the `jaeger-operator` chart, which makes it easy to configure a minimal installation. To learn more about the Jaeger Operator for Kubernetes, consult the official documentation.

```shell
helm install jaeger-op jaegertracing/jaeger-operator
```
Minimal Deployment
Deploying Jaeger in all its details is a topic well beyond the scope of this article. We will deploy Jaeger with the `AllInOne` topology using the configuration below, which will be sufficient to demonstrate the integration:
```yaml
# jaeger.yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger
```
The above configuration will create an instance named `jaeger`. It will also create a `query-ui`, an `agent`, and a `collector`, and all of these related services are prefixed with the `jaeger` name. It will not deploy a database such as Cassandra or Elasticsearch; instead, it will rely on in-memory data processing. Apply the configuration with `kubectl`:

```shell
kubectl apply -f jaeger.yaml
```
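The minimal manifest above relies on the operator's defaults. If you prefer to make the demo topology explicit, an equivalent manifest might look like the following sketch (the `strategy` and `storage` fields are assumptions based on the Jaeger Operator's documented options, not part of the original configuration):

```yaml
# jaeger.yaml — explicit sketch of the defaults used above
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger
spec:
  strategy: allInOne   # single pod running agent, collector, and query UI
  storage:
    type: memory       # traces are kept in memory and lost on pod restart
```

This makes the trade-off visible: the `AllInOne` topology is ideal for demos, but in-memory storage means traces disappear whenever the pod restarts.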
You can confirm Jaeger is running by doing a lookup of all deployed services:

```shell
$ kubectl get services
NAME                                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                  AGE
kubernetes                          ClusterIP   10.43.0.1       <none>        443/TCP                                  76m
jaeger-op-jaeger-operator-metrics   ClusterIP   10.43.86.167    <none>        8383/TCP,8686/TCP                        82s
jaeger-collector-headless           ClusterIP   None            <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP   47s
jaeger-collector                    ClusterIP   10.43.163.147   <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP   47s
jaeger-query                        ClusterIP   10.43.27.251    <none>        16686/TCP                                47s
jaeger-agent                        ClusterIP   None            <none>        5775/UDP,5778/TCP,6831/UDP,6832/UDP      47s
```
Install and Configure Traefik
Now it's time to deploy Traefik, which you'll do using the official Helm chart. If you haven't already, add Traefik Labs to your Helm repository list using the commands below:

```shell
helm repo add traefik https://helm.traefik.io/traefik
helm repo update
```
Next you'll deploy the latest version of Traefik in the `kube-system` namespace. For this demo, however, the standard configuration of the Helm chart won't be enough. As part of the deployment, you need to ensure that Jaeger integration is enabled in Traefik. You do this by passing `additionalArguments` configuration flags in the `traefik-values.yaml` file:

```yaml
- "--tracing.jaeger=true"
- "--tracing.jaeger.samplingServerURL=http://jaeger-agent.default.svc:5778/sampling"
- "--tracing.jaeger.localAgentHostPort=jaeger-agent.default.svc:6831"
```
As shown in the above configuration, you need to provide an address for the Jaeger agent. By default, this is `localhost`, and if you deploy `jaeger-agent` as a sidecar, this works as expected. In this deployment, however, you need to provide an explicit address for `jaeger-agent`, which corresponds to the `jaeger-agent.default.svc` hostname that was configured by the Helm chart.
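Put together, a minimal `traefik-values.yaml` might look like the sketch below. The `additionalArguments` key is the Helm chart value that the flags above belong under; the last two sampling flags are optional additions (not part of the original configuration) that force Traefik's Jaeger client to sample every request, which is convenient for a demo:

```yaml
# traefik-values.yaml — minimal sketch
additionalArguments:
  - "--tracing.jaeger=true"
  - "--tracing.jaeger.samplingServerURL=http://jaeger-agent.default.svc:5778/sampling"
  - "--tracing.jaeger.localAgentHostPort=jaeger-agent.default.svc:6831"
  # Optional: sample 100% of requests rather than consulting the sampling server
  - "--tracing.jaeger.samplingType=const"
  - "--tracing.jaeger.samplingParam=1.0"
```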
Use the Helm chart to deploy Traefik into the `kube-system` namespace with the configuration options for Jaeger, like so:

```shell
helm install traefik traefik/traefik -n kube-system -f ./traefik-values.yaml
```
Once the pods are created, you can verify the Jaeger integration by using port forwarding to expose the Traefik dashboard:

```shell
kubectl -n kube-system port-forward $(kubectl -n kube-system get pods --selector "app.kubernetes.io/name=traefik" --output=name) 9000:9000
```
If you access the Traefik dashboard at http://localhost:9000/dashboard/, you will see that Jaeger tracing is enabled under the Features section:
Now is also a good time to expose the Jaeger UI, which is served on port 16686:

```shell
kubectl port-forward service/jaeger-query 16686:16686
```
When you access the Jaeger dashboard at http://localhost:16686/, you will see `traefik` in the Service pull-down, and the Traefik endpoints will be listed in the Operations pull-down:
Deploy Hot R.O.D.
Now that your integration is working, you need an application to trace. For this purpose, you should deploy Hot R.O.D. - Rides On Demand, which is an example application created by the Jaeger team. It is a demo ride-booking service that consists of three microservices: `driver-service`, `customer-service`, and `route-service`. Each service also has accompanying storage, such as a MySQL database or Redis cache.
The application includes four pre-built "customer personas" who can book a ride using the application UI. When a car is booked, the application will find a driver and dispatch the car.
Throughout the process, Jaeger will capture the user request as it flows through the various services (`driver-service`, `customer-service`, `route-service`). Individual service handling will be shown as a "span," and all related spans are visualized in a graph known as the "trace."
Deploy the Service along with the IngressRoute using the following configuration file:

```shell
$ kubectl apply -f hotrod.yaml
deployment.apps/hotrod created
service/hotrod created
ingressroute.traefik.containo.us/hotrod created
```
The `hotrod` route will match the hostname `hotrod.localhost`, which allows you to open the application UI. (If you used k3d to create a demo cluster at the start of this tutorial, recall that it is exposed on port 8081.)
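The route itself is defined in the `hotrod.yaml` file from the accompanying repository. If you want to adapt it for your own services, the IngressRoute portion looks roughly like the sketch below (the entry point name and service port are assumptions for illustration; check the file in the repository for the exact values):

```yaml
# Sketch of the IngressRoute defined in hotrod.yaml
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: hotrod
spec:
  entryPoints:
    - web                              # assumed entry point name
  routes:
    - match: Host(`hotrod.localhost`)  # hostname used throughout this demo
      kind: Rule
      services:
        - name: hotrod                 # the Service created by hotrod.yaml
          port: 8080                   # assumed frontend container port
```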
In the above UI you can see the four prebuilt customer personas. This UI is not required for this tracing demo, however, as you can use command-line tools.
Application Traces
To see Jaeger in action, send a few user requests to the application using a sample customer persona. For example, try the following `curl` commands:

```shell
curl -I "http://localhost:8081/dispatch?customer=392" -H "host:hotrod.localhost"
curl -I "http://localhost:8081/dispatch?customer=123" -H "host:hotrod.localhost"
```
Each command triggered a sequence of requests to produce the expected result. You can see the generated traces in the Jaeger UI when you select `traefik` as the Service and `hotrod.localhost` as the Operation and click Find Traces:
You can select either of the traces to explore the detailed request flow.
The display above shows the top two spans expanded to show the information forwarded by Traefik. Each span shows the request duration, along with non-mandatory sections for Tags, Process, and Logs. The Tags section contains key-value pairs that can be associated with request handling.
The Tags field of the topmost `traefik` span shows information related to HTTP handling, such as the status code, URL, host, and so on. The next span shows the routing information for the request, including the router name and service name.
Jaeger can also deduce an overall architecture by analyzing the request traces. This diagram is available under the System Architecture > DAG tab:
The graph shows that you made two requests, which were routed to the `frontend` service. The `frontend` service then fanned out requests to the `customer`, `driver`, and `route` services.
Returning to the Search tab of the Jaeger UI, you can see that the current cluster has traces generated for the following three entrypoints:

- `traefik-dashboard`, which you used for lookup
- `ping api`, used by Kubernetes for health checks
- `hotrod.localhost`, used by the Hot R.O.D. application

As you deploy more applications to your cluster, you will see more entries in the Operations drop-down, based on the `entrypoint` match.
Wrap Up
This post has presented a very simple demonstration of how to integrate Traefik with Jaeger. There is much more to explore with Jaeger, and similar integrations can be done with other tracing systems, such as Zipkin and Datadog. Whichever one you choose, Traefik makes it easy to follow the progress of each request and gain insights into application flow.
We hope you've enjoyed this series of articles on how Traefik's capabilities can enable app monitoring and health analysis for SRE. If you missed the earlier installments on log aggregation and metrics, respectively, be sure to take a look. All three articles demonstrate how readily available open-source software, including Traefik, can empower practices that both increase app uptime and contribute to improving the design of distributed systems.
If you'd like to explore Traefik's monitoring and visibility features even further, check out Traefik Pilot, the SaaS monitoring and management platform from Traefik Labs.
This is a companion discussion topic for the original entry at https://traefik.io/blog/application-request-tracing-with-traefik-and-jaeger-on-kubernetes/