Hello everyone,
I am quite new to Kubernetes / Traefik / Prometheus. I have managed to visualize a lot of metrics about my cluster with Prometheus / Grafana, and now I would like to get metrics from Traefik as well.
Here are some details about my configuration:
I installed Traefik through Helm; each time I want to upgrade the configuration, I use the following command:
$ helm upgrade -n kube-system traefik traefik/traefik --values values.yml
My values.yml file:
additionalArguments:
  - "--providers.kubernetesIngress.ingressClass=traefik-cert-manager"
  - "--ping"
  - "--log.level=DEBUG"
  - "--log.format=json"
  - "--metrics.prometheus"
  - "--metrics.prometheus.buckets=0.100000, 0.300000, 1.200000, 5.000000"
  - "--metrics.prometheus.addEntryPointsLabels=true"
  - "--metrics.prometheus.addServicesLabels=true"
My Traefik dashboard shows that it seems ready to expose metrics.
For checking:
$ kubectl describe pod/traefik-64686bd987-9ws5r -n kube-system
Name:         traefik-64686bd987-9ws5r
Namespace:    kube-system
Priority:     0
Node:         ip-172-31-45-205.us-east-2.compute.internal/172.31.45.205
Start Time:   Fri, 09 Apr 2021 08:43:35 +0000
Labels:       app.kubernetes.io/instance=traefik
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=traefik
              helm.sh/chart=traefik-9.1.1
              pod-template-hash=64686bd987
Annotations:  kubernetes.io/psp: eks.privileged
Status:       Running
IP:           172.31.41.59
IPs:
  IP:           172.31.41.59
Controlled By:  ReplicaSet/traefik-64686bd987
Containers:
  traefik:
    Container ID:  docker://19f366b0441c40d67b227747ae93ddc849c1bc392b22e84005f7db4d5f9b5c49
    Image:         traefik:2.2.8
    Image ID:      docker-pullable://traefik@sha256:f5af5a5ce17fc3e353b507e8acce65d7f28126408a8c92dc3cac246d023dc9e8
    Ports:         9000/TCP, 8000/TCP, 8443/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Args:
      --global.checknewversion
      --global.sendanonymoususage
      --entryPoints.traefik.address=:9000/tcp
      --entryPoints.web.address=:8000/tcp
      --entryPoints.websecure.address=:8443/tcp
      --api.dashboard=true
      --ping=true
      --providers.kubernetescrd
      --providers.kubernetesingress
      --providers.kubernetesIngress.ingressClass=traefik-cert-manager
      --ping
      --log.level=DEBUG
      --log.format=json
      --metrics.prometheus=true
      --metrics.prometheus.buckets=0.100000, 0.300000, 1.200000, 5.000000
      --metrics.prometheus.addEntryPointsLabels=true
      --metrics.prometheus.addServicesLabels=true
    State:          Running
      Started:      Fri, 09 Apr 2021 08:43:37 +0000
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:9000/ping delay=10s timeout=2s period=10s #success=1 #failure=3
    Readiness:      http-get http://:9000/ping delay=10s timeout=2s period=10s #success=1 #failure=1
    Environment:    <none>
    Mounts:
      /data from data (rw)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from traefik-token-gktx2 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  traefik-token-gktx2:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  traefik-token-gktx2
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  3m10s  default-scheduler  Successfully assigned kube-system/traefik-64686bd987-9ws5r to ip-172-31-45-205.us-east-2.compute.internal
  Normal  Pulled     3m9s   kubelet            Container image "traefik:2.2.8" already present on machine
  Normal  Created    3m9s   kubelet            Created container traefik
  Normal  Started    3m9s   kubelet            Started container traefik
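For what it's worth, I suppose I can also test the metrics endpoint directly by port-forwarding to the pod and curling it (assuming the metrics are served on the traefik entrypoint, port 9000):
$ kubectl port-forward -n kube-system pod/traefik-64686bd987-9ws5r 9000:9000
$ curl http://localhost:9000/metrics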
Nevertheless, Prometheus cannot reach the Traefik metrics.
Here is an extract of my prometheus.yml file:
scrape_configs:
  - job_name: 'traefik'
    static_configs:
      - targets:
        - traefik.kube-system.svc.cluster.local:9000
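One thing I am not sure about: does the traefik Service created by the Helm chart actually expose port 9000? I guess I can check the ports it publishes with something like:
$ kubectl get svc traefik -n kube-system
$ kubectl get svc traefik -n kube-system -o jsonpath='{.spec.ports}'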
At this point, I searched the internet for a solution and found this post: Capture Traefik Metrics for Apps on Kubernetes with Prometheus. It recommends creating a file traefik-service-monitor.yaml, so I created it and applied it with kubectl:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: traefik
  namespace: default
  labels:
    app: traefik
    release: prometheus-stack
spec:
  jobLabel: traefik-metrics
  selector:
    matchLabels:
      app.kubernetes.io/instance: traefik
      app.kubernetes.io/name: traefik-dashboard
  namespaceSelector:
    matchNames:
    - kube-system
  endpoints:
  - port: traefik
    path: /metrics
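I am not sure whether this ServiceMonitor is actually picked up by Prometheus. Assuming I am running kube-prometheus-stack (Prometheus Operator), I think I can check the ServiceMonitor and the selector used by the Prometheus resource with:
$ kubectl get servicemonitor traefik -n default -o yaml
$ kubectl get prometheus -A -o yaml | grep -i -A 3 serviceMonitorSelector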
To follow the tutorial, I also created this file, traefik-dashboard-service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: traefik-dashboard
  namespace: kube-system
  labels:
    app.kubernetes.io/instance: traefik
    app.kubernetes.io/name: traefik-dashboard
spec:
  type: ClusterIP
  ports:
  - name: traefik
    port: 9000
    targetPort: traefik
    protocol: TCP
  selector:
    app.kubernetes.io/instance: traefik
    app.kubernetes.io/name: traefik
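After applying it, I suppose I should also verify that this Service really selects the Traefik pod, i.e. that it gets endpoints:
$ kubectl apply -f traefik-dashboard-service.yaml
$ kubectl get endpoints traefik-dashboard -n kube-system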
After all that, I still have the same problem: Prometheus cannot reach the Traefik metrics.
Can someone help me with this? Any idea would be really appreciated!
Regards

