Application Request Tracing with Traefik and Jaeger on Kubernetes

This is the third in a series of articles on site reliability engineering (SRE) and how Traefik can help supply the monitoring and visibility that are necessary to maintain application health.

The first article discussed log analysis using tools from the Elastic stack. The second covered creating visualizations from Traefik metrics with Prometheus and Grafana. This third article explores using another open-source project, Jaeger, to perform request tracing for applications on Kubernetes.

About Tracing

Debugging anomalies, bottlenecks, and performance issues is a challenge in distributed architectures, such as microservices. Each user request typically involves the collaboration of many services to deliver the intended outcome. Because traditional monitoring methods like application logs and metrics tend to target monolithic applications, they can fail to capture the full performance trail for every request.

Distributed tracing, therefore, is an important profiling technique that complements log monitoring and metrics. It captures the transaction flow across various application components and services involved in processing a user request. The captured data can then be visualized to show which component malfunctioned and caused an issue, such as an error or bottleneck.

This post demonstrates how to integrate Traefik with Jaeger, an open-source tracing application that's a project of the Cloud Native Computing Foundation. The integration will capture traces for user requests across the various components of a hypothetical application running on a Kubernetes cluster.

Prerequisites

This post will walk you through the process of integrating Traefik and Jaeger, but you'll need to have a few things set up first:

  1. A Kubernetes cluster running at localhost. The Traefik Labs team often uses k3d for this purpose, which creates a local cluster in Docker containers. However, k3d comes bundled with the latest version of k3s, and k3s comes packaged with Traefik v1.7, which you'll want to disable so you can use the latest version (if you're on a newer k3d release, see the note at the end of this section). The following command creates the cluster and exposes it on port 8081:

    k3d cluster create dev -p "8081:80@loadbalancer" --k3s-server-arg --disable=traefik
    
  2. The kubectl command-line tool, configured to point to your cluster. (If you created your cluster using k3d and the instructions above, this will already be done for you.)

  3. A recent version of the Helm package manager for Kubernetes.

  4. The set of configuration files that accompany this article, which are available on GitHub:

    git clone https://github.com/traefik-tech-blog/traefik-sre-tracing/
    

You do not need to have Traefik 2.x preinstalled, as you'll install it along the way.
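Note that the --k3s-server-arg flag shown above applies to k3d v4 and earlier. If you're running k3d v5 or later, the flag syntax has changed, and the equivalent command should look like this:

k3d cluster create dev -p "8081:80@loadbalancer" --k3s-arg "--disable=traefik@server:*"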

Set Up Tracing

First, you'll need to install and configure Jaeger on your Kubernetes cluster. The simplest way is to use the official Helm chart. As a first step, add the jaegertracing repository to your Helm repo list and update its contents:

helm repo add jaegertracing https://jaegertracing.github.io/helm-charts
helm repo update

The Jaeger repository provides two charts: jaeger and jaeger-operator. For the purpose of this discussion, you'll deploy the jaeger-operator chart, which makes it easy to configure a minimal installation. To learn more about the Jaeger Operator for Kubernetes, consult the official documentation.

helm install jaeger-op jaegertracing/jaeger-operator
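Before moving on, you may want to confirm that the operator is up. Assuming the jaeger-op release name used above, the operator Deployment should be named jaeger-op-jaeger-operator:

kubectl get deployment jaeger-op-jaeger-operator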

Minimal Deployment

Deploying Jaeger in all its detail is a topic well beyond the scope of this article. Instead, you'll deploy Jaeger with the AllInOne strategy using the configuration below, which is sufficient to demonstrate the integration:

# jaeger.yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger

The above configuration will create an instance named jaeger. It will also create a query UI, an agent, and a collector, and all of these related services are prefixed with the jaeger name. It will not deploy a database such as Cassandra or Elasticsearch; instead, it will rely on in-memory storage.
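If you prefer to make those defaults explicit, the same instance can be declared with the deployment strategy and storage type spelled out. The following is a sketch using the standard fields of the Jaeger custom resource:

# jaeger.yaml (equivalent, with defaults spelled out)
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger
spec:
  strategy: allInOne
  storage:
    type: memory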

kubectl apply -f jaeger.yaml

You can confirm Jaeger is running by doing a lookup of all deployed services:

$ kubectl get services
NAME                                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                  AGE
kubernetes                          ClusterIP   10.43.0.1       <none>        443/TCP                                  76m
jaeger-op-jaeger-operator-metrics   ClusterIP   10.43.86.167    <none>        8383/TCP,8686/TCP                        82s
jaeger-collector-headless           ClusterIP   None            <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP   47s
jaeger-collector                    ClusterIP   10.43.163.147   <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP   47s
jaeger-query                        ClusterIP   10.43.27.251    <none>        16686/TCP                                47s
jaeger-agent                        ClusterIP   None            <none>        5775/UDP,5778/TCP,6831/UDP,6832/UDP      47s
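You can also query the Jaeger custom resource itself. Assuming the jaegers.jaegertracing.io CRD installed by the operator, the following should list the jaeger instance you just created:

kubectl get jaegers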

Install and Configure Traefik

Now it's time to deploy Traefik, which you'll do using the official Helm chart. If you haven't already, add the Traefik Labs repository to your Helm repository list using the commands below:

helm repo add traefik https://helm.traefik.io/traefik
helm repo update

Next, you'll deploy the latest version of Traefik in the kube-system namespace. For this demo, however, the standard configuration of the Helm chart won't be enough. As part of the deployment, you need to ensure that Jaeger integration is enabled in Traefik. You do this by setting configuration flags under the additionalArguments key in the traefik-values.yaml file:

  - "--tracing.jaeger=true"
  - "--tracing.jaeger.samplingServerURL=http://jaeger-agent.default.svc:5778/sampling"
  - "--tracing.jaeger.localAgentHostPort=jaeger-agent.default.svc:6831"

As shown in the above configuration, you need to provide an address for the Jaeger agent. By default, this is localhost, and if you deploy jaeger-agent as a sidecar, this works as expected. In this deployment, however, you need to provide an explicit address for jaeger-agent, which corresponds to the jaeger-agent service that the Jaeger Operator created in the default namespace.

Use the Helm chart to deploy Traefik into the kube-system namespace with the configuration options for Jaeger, like so:

helm install traefik traefik/traefik -n kube-system -f ./traefik-values.yaml

Once the pods are created, you can verify the Jaeger integration by using port forwarding to expose the Traefik dashboard:

kubectl -n kube-system port-forward $(kubectl -n kube-system get pods --selector "app.kubernetes.io/name=traefik" --output=name) 9000:9000

If you access the Traefik dashboard at http://localhost:9000/dashboard/, you will see that Jaeger tracing is enabled under the Features section:
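If you prefer the command line, you can also query the dashboard API. The overview endpoint should report the active tracing backend, although the exact output shape may vary between Traefik 2.x releases:

curl -s http://localhost:9000/api/overview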

Now is also a good time to expose the Jaeger UI, which is served on port 16686:

kubectl port-forward service/jaeger-query 16686:16686

When you access the Jaeger dashboard at http://localhost:16686/, you will see traefik in the Service pull-down, and the Traefik endpoints will be listed in the Operations pull-down:

Deploy Hot R.O.D.

Now that your integration is working, you need an application to trace. For this purpose, you should deploy Hot R.O.D. - Rides On Demand, which is an example application created by the Jaeger team. It is a demo ride-booking service that consists of three microservices: driver-service, customer-service, and route-service. Each service also has accompanying storage, such as a MySQL database or Redis cache.

The application includes four pre-built "customer personas" who can book a ride using the application UI. When a car is booked, the application will find a driver and dispatch the car.

Throughout the process, Jaeger will capture the user request as it flows through the various services (driver-service, customer-service, route-service). Individual service handling will be shown as a "span," and all related spans are visualized in a graph known as the "trace."

Deploy the Service along with the IngressRoute by applying the hotrod.yaml configuration file from the accompanying repository.
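The routing part of hotrod.yaml is an IngressRoute similar to the sketch below; the entrypoint name and service port shown here are assumptions, and the actual file also contains the Deployment and Service definitions:

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: hotrod
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`hotrod.localhost`)
      kind: Rule
      services:
        - name: hotrod
          port: 8080

Apply the file like so: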

$ kubectl apply -f hotrod.yaml
deployment.apps/hotrod created
service/hotrod created
ingressroute.traefik.containo.us/hotrod created

The hotrod route will match the hostname hotrod.localhost, which allows you to open the application UI. (If you used k3d to create a demo cluster at the start of this tutorial, recall that it is exposed on port 8081, so the UI is available at http://hotrod.localhost:8081/.)

In the above UI you can see the four prebuilt customer personas. This UI is not required for this tracing demo, however, as you can use command-line tools.

Application Traces

To see Jaeger in action, send a few user requests to the application using a sample customer persona. For example, try the following curl commands:

curl -I "http://localhost:8081/dispatch?customer=392" -H "host:hotrod.localhost"

curl -I "http://localhost:8081/dispatch?customer=123" -H "host:hotrod.localhost"

Each of these requests triggers a sequence of calls across the services to produce the expected result. You can see the generated traces in the Jaeger UI when you select traefik as the Service and hotrod.localhost as the Operation and click Find Traces:

You can select either of the traces to explore the detailed request flow.

The display above shows the top two spans expanded to reveal the information forwarded by Traefik. Each span shows the request duration, along with optional sections for Tags, Process, and Logs. The Tags section contains key-value pairs that can be associated with request handling.

The Tags field of the topmost traefik span shows information related to HTTP handling, such as the status code, URL, host, and so on. The next span shows the routing information for the request, including the router name and service name.
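For instance, the entrypoint span's Tags might contain entries along these lines (illustrative values only, not an exhaustive list of what Traefik records):

component: traefik
http.method: GET
http.host: hotrod.localhost
http.status_code: 200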

Jaeger can also deduce an overall architecture by analyzing the request traces. This diagram is available under the System Architecture > DAG tab:

The graph shows that you made two requests, which were routed to the frontend service. The frontend service then fanned out requests to the customer, driver, and route services.

Returning to the Search tab of the Jaeger UI, you can see that the current cluster has traces generated for the following three entrypoints:

  • traefik-dashboard, which you used for lookup
  • ping api, used by Kubernetes for health checks
  • hotrod.localhost, used by the Hot R.O.D. application

As you deploy more applications to your cluster, you will see more entries in the Operations drop-down, based on the entrypoint match.

Wrap Up

This post has presented a very simple demonstration of how to integrate Traefik with Jaeger. There is much more to explore with Jaeger, and similar integrations can be done with other tracing systems, such as Zipkin and Datadog. Whichever one you choose, Traefik makes it easy to follow the progress of each request and gain insights into application flow.
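For example, switching the same setup to Zipkin would mean replacing the Jaeger flags in traefik-values.yaml with something like the following; the collector endpoint is an assumption for a Zipkin service running in the default namespace:

additionalArguments:
  - "--tracing.zipkin=true"
  - "--tracing.zipkin.httpEndpoint=http://zipkin.default.svc:9411/api/v2/spans"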

We hope you've enjoyed this series of articles on how Traefik's capabilities can enable app monitoring and health analysis for SRE. If you missed the earlier installments on log aggregation and metrics, respectively, be sure to take a look. All three articles demonstrate how readily available open-source software, including Traefik, can empower practices that both increase app uptime and contribute to improving the design of distributed systems.

If you'd like to explore Traefik's monitoring and visibility features even further, check out Traefik Pilot, the SaaS monitoring and management platform from Traefik Labs.


This is a companion discussion topic for the original entry at https://traefik.io/blog/application-request-tracing-with-traefik-and-jaeger-on-kubernetes/

Hi, thanks for your work!
The link to your GitHub repository in the Prerequisites section seems to be broken (or is the repository private?). It would be great if this could be fixed.

Regards

Hello @marmorag

Thank you for going through the blog post and pointing out the issue with the GitHub repo visibility.

It has just been fixed, so you should be able to clone the repo.

Thank you once again for raising the issue, and let me know if you run into any other issues concerning this blog post or any other Traefik-related topics.

Kind regards,
Jakub

Hi !

Apparently, the Prerequisites section doesn't reflect the latest way to create a cluster without traefik.

k3d cluster create dev -p "8081:80@loadbalancer" --k3s-server-arg --disable=traefik

According to this thread, in k3d version 5 you should use:

k3d cluster create dev -p "8081:80@loadbalancer" --k3s-arg "--disable=traefik@server:*"

Hi again :wink: So far, your article is useful, thanks for the work.

Here's an issue I ran into:

helm install traefik traefik/traefik -n kube-system -f ./traefik-values.yaml

This command led me to this error: Error: INSTALLATION FAILED: failed to parse ./traefik-values.yaml: error unmarshaling JSON: while decoding JSON: json: cannot unmarshal array into Go value of type map[string]interface {}

I fixed it by using the following values for the file traefik-values.yaml:

# traefik-values.yaml
tracing:
  jaeger:
    samplingServerURL: http://jaeger-agent.default.svc:5778/sampling
    localAgentHostPort: jaeger-agent.default.svc:6831