Vector agent cannot connect to vector aggregator while going through Traefik

Hi,

I'm trying to set up an infrastructure where vector (https://vector.dev/) agents connect to vector aggregators in the cloud through Traefik.

[CLIENT PREMISE vector agents] -> [CLOUD Traefik :6000/tcp -> HAProxy service :6000/tcp -> vector aggregator :6000/tcp ]

But when a vector agent goes through Traefik, the connection to the proxy service fails and Traefik logs 502 Bad Gateway errors:

2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/server/configurationwatcher.go:227 > Configuration received config={"http":{"routers":{"mcms-mcms-api-api":{"entryPoints":["web"],"middlewares":["mcms-mcms-api-strip-prefix@kubernetescrd"],"rule":"PathPrefix(`/api`)","service":"mcms-mcms-api-api"},"mcms-mcms-ui":{"entryPoints":["web"],"middlewares":["mcms-mcms-api-strip-prefix@kubernetescrd"],"rule":"PathPrefix(`/`)","service":"mcms-mcms-ui-web"},"mcms-vector-ingress":{"entryPoints":["vector"],"rule":"PathPrefix(`/`)","service":"mcms-vector-haproxy-vector"}},"services":{"mcms-mcms-api-api":{"loadBalancer":{"passHostHeader":true,"responseForwarding":{"flushInterval":"100ms"},"servers":[{"url":"http://10.244.0.45:3000"}]}},"mcms-mcms-ui-web":{"loadBalancer":{"passHostHeader":true,"responseForwarding":{"flushInterval":"100ms"},"servers":[{"url":"http://10.244.0.44:80"}]}},"mcms-vector-haproxy-vector":{"loadBalancer":{"passHostHeader":true,"responseForwarding":{"flushInterval":"100ms"},"servers":[{"url":"http://10.244.0.160:6000"}]}}}},"tcp":{},"udp":{}} providerName=kubernetes
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/tls/tlsmanager.go:321 > No default certificate, fallback to the internal generated certificate tlsStoreName=default
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/server/service/service.go:299 > Creating load-balancer entryPointName=web routerName=mcms-mcms-ui@kubernetes serviceName=mcms-mcms-ui-web@kubernetes
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/server/service/service.go:336 > Creating server entryPointName=web routerName=mcms-mcms-ui@kubernetes serverName=1a7527c3867f3cb4 serviceName=mcms-mcms-ui-web@kubernetes target=http://10.244.0.44:80
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/middlewares/stripprefix/strip_prefix.go:32 > Creating middleware entryPointName=web middlewareName=mcms-mcms-api-strip-prefix@kubernetescrd middlewareType=StripPrefix routerName=mcms-mcms-ui@kubernetes
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/middlewares/observability/middleware.go:33 > Adding tracing to middleware entryPointName=web middlewareName=mcms-mcms-api-strip-prefix@kubernetescrd routerName=mcms-mcms-ui@kubernetes
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/server/service/service.go:299 > Creating load-balancer entryPointName=web routerName=mcms-mcms-api-api@kubernetes serviceName=mcms-mcms-api-api@kubernetes
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/server/service/service.go:336 > Creating server entryPointName=web routerName=mcms-mcms-api-api@kubernetes serverName=3b45646bc08c4cc2 serviceName=mcms-mcms-api-api@kubernetes target=http://10.244.0.45:3000
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/middlewares/stripprefix/strip_prefix.go:32 > Creating middleware entryPointName=web middlewareName=mcms-mcms-api-strip-prefix@kubernetescrd middlewareType=StripPrefix routerName=mcms-mcms-api-api@kubernetes
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/middlewares/observability/middleware.go:33 > Adding tracing to middleware entryPointName=web middlewareName=mcms-mcms-api-strip-prefix@kubernetescrd routerName=mcms-mcms-api-api@kubernetes
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/middlewares/recovery/recovery.go:25 > Creating middleware entryPointName=web middlewareName=traefik-internal-recovery middlewareType=Recovery
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/server/service/service.go:299 > Creating load-balancer entryPointName=vector routerName=mcms-vector-ingress@kubernetes serviceName=mcms-vector-haproxy-vector@kubernetes
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/server/service/service.go:336 > Creating server entryPointName=vector routerName=mcms-vector-ingress@kubernetes serverName=8643cfa1fcfeaf4d serviceName=mcms-vector-haproxy-vector@kubernetes target=http://10.244.0.160:6000
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/middlewares/recovery/recovery.go:25 > Creating middleware entryPointName=vector middlewareName=traefik-internal-recovery middlewareType=Recovery
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/middlewares/stripprefix/strip_prefix.go:32 > Creating middleware entryPointName=traefik middlewareName=dashboard_stripprefix@internal middlewareType=StripPrefix routerName=dashboard@internal
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/middlewares/observability/middleware.go:33 > Adding tracing to middleware entryPointName=traefik middlewareName=dashboard_stripprefix@internal routerName=dashboard@internal
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/middlewares/redirect/redirect_regex.go:17 > Creating middleware entryPointName=traefik middlewareName=dashboard_redirect@internal middlewareType=RedirectRegex routerName=dashboard@internal
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/middlewares/redirect/redirect_regex.go:18 > Setting up redirection from ^(http:\/\/(\[[\w:.]+\]|[\w\._-]+)(:\d+)?)\/$ to ${1}/dashboard/ entryPointName=traefik middlewareName=dashboard_redirect@internal middlewareType=RedirectRegex routerName=dashboard@internal
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/middlewares/observability/middleware.go:33 > Adding tracing to middleware entryPointName=traefik middlewareName=dashboard_redirect@internal routerName=dashboard@internal
2024-11-19T23:47:22Z DBG github.com/traefik/traefik/v3/pkg/middlewares/recovery/recovery.go:25 > Creating middleware entryPointName=traefik middlewareName=traefik-internal-recovery middlewareType=Recovery
2024-11-19T23:48:01Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 8643cfa1fcfeaf4d
2024-11-19T23:48:01Z DBG github.com/traefik/traefik/v3/pkg/proxy/httputil/proxy.go:113 > 502 Bad Gateway error=EOF
10.244.0.1 - - [19/Nov/2024:23:48:01 +0000] "POST /vector.Vector/HealthCheck HTTP/2.0" 502 11 "-" "-" 604 "mcms-vector-ingress@kubernetes" "http://10.244.0.160:6000" 5ms
2024-11-19T23:48:01Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 8643cfa1fcfeaf4d
2024-11-19T23:48:01Z DBG github.com/traefik/traefik/v3/pkg/proxy/httputil/proxy.go:113 > 502 Bad Gateway error="write tcp 10.244.0.159:45050->10.244.0.160:6000: use of closed network connection"
10.244.0.1 - - [19/Nov/2024:23:48:01 +0000] "POST /vector.Vector/PushEvents HTTP/2.0" 502 11 "-" "-" 605 "mcms-vector-ingress@kubernetes" "http://10.244.0.160:6000" 1ms
2024-11-19T23:48:01Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 8643cfa1fcfeaf4d
2024-11-19T23:48:01Z DBG github.com/traefik/traefik/v3/pkg/proxy/httputil/proxy.go:113 > 502 Bad Gateway error="write tcp 10.244.0.159:45062->10.244.0.160:6000: use of closed network connection"
10.244.0.1 - - [19/Nov/2024:23:48:01 +0000] "POST /vector.Vector/PushEvents HTTP/2.0" 502 11 "-" "-" 606 "mcms-vector-ingress@kubernetes" "http://10.244.0.160:6000" 0ms
2024-11-19T23:48:02Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 8643cfa1fcfeaf4d
2024-11-19T23:48:02Z DBG github.com/traefik/traefik/v3/pkg/proxy/httputil/proxy.go:113 > 502 Bad Gateway error="write tcp 10.244.0.159:45074->10.244.0.160:6000: write: broken pipe"
10.244.0.1 - - [19/Nov/2024:23:48:02 +0000] "POST /vector.Vector/PushEvents HTTP/2.0" 502 11 "-" "-" 607 "mcms-vector-ingress@kubernetes" "http://10.244.0.160:6000" 0ms
2024-11-19T23:48:03Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:196 > Service selected by WRR: 8643cfa1fcfeaf4d
2024-11-19T23:48:03Z DBG github.com/traefik/traefik/v3/pkg/proxy/httputil/proxy.go:113 > 502 Bad Gateway error="write tcp 10.244.0.159:45084->10.244.0.160:6000: use of closed network connection"

I can see the entrypoint and service in the Traefik dashboard.

These are the services running:

NAME                    TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)             AGE
grafana                 ClusterIP      10.103.35.112    <none>           80/TCP              4d23h
mcms-api                ClusterIP      10.111.14.173    <none>           3000/TCP            7d
mcms-es-cold            ClusterIP      None             <none>           9200/TCP            6d18h
mcms-es-hot             ClusterIP      None             <none>           9200/TCP            6d18h
mcms-es-http            ClusterIP      10.100.42.80     <none>           9200/TCP            6d18h
mcms-es-internal-http   ClusterIP      10.97.10.45      <none>           9200/TCP            6d18h
mcms-es-transport       ClusterIP      None             <none>           9300/TCP            6d18h
mcms-es-warm            ClusterIP      None             <none>           9200/TCP            6d18h
mcms-kb-http            ClusterIP      10.109.197.34    <none>           5601/TCP            6d18h
mcms-master-nodes       ClusterIP      10.99.126.91     <none>           9200/TCP            6d18h
mcms-ui                 ClusterIP      10.103.42.48     <none>           80/TCP              7d22h
traefik-dashboard       LoadBalancer   10.104.145.186   10.104.145.186   8080:31762/TCP      7d22h
traefik-vector          LoadBalancer   10.104.245.201   10.104.245.201   6000:32462/TCP      38h
traefik-web             LoadBalancer   10.110.225.94    10.110.225.94    80:31343/TCP        7d22h
vector                  ClusterIP      10.106.78.189    <none>           6000/TCP,9090/TCP   6d15h
vector-haproxy          ClusterIP      10.98.195.94     <none>           6000/TCP,9090/TCP   16h
vector-headless         ClusterIP      None             <none>           6000/TCP,9090/TCP   6d15h

These are the endpoints:

NAME                    ENDPOINTS                                                           AGE
grafana                 10.244.0.154:3000                                                   4d23h
mcms-api                10.244.0.45:3000                                                    7d
mcms-es-cold            10.244.0.141:9200                                                   6d18h
mcms-es-hot             10.244.0.134:9200,10.244.0.135:9200,10.244.0.136:9200               6d18h
mcms-es-http            10.244.0.134:9200,10.244.0.135:9200,10.244.0.136:9200 + 3 more...   6d18h
mcms-es-internal-http   10.244.0.134:9200,10.244.0.135:9200,10.244.0.136:9200 + 3 more...   6d18h
mcms-es-transport       10.244.0.134:9300,10.244.0.135:9300,10.244.0.136:9300 + 3 more...   6d18h
mcms-es-warm            10.244.0.142:9200,10.244.0.143:9200                                 6d18h
mcms-kb-http            10.244.0.80:5601                                                    6d18h
mcms-master-nodes       <none>                                                              6d18h
mcms-ui                 10.244.0.44:80                                                      7d22h
traefik-dashboard       10.244.0.159:8080                                                   7d22h
traefik-vector          10.244.0.159:6000                                                   38h
traefik-web             10.244.0.159:80                                                     7d22h
vector                  10.244.0.139:6000,10.244.0.140:6000,10.244.0.139:9090 + 1 more...   6d15h
vector-haproxy          10.244.0.160:6000,10.244.0.160:9090                                 15h
vector-headless         10.244.0.139:6000,10.244.0.140:6000,10.244.0.139:9090 + 1 more...   6d15h

This is the HAProxy service:

Name:                     vector-haproxy
Namespace:                mcms
Labels:                   app.kubernetes.io/component=load-balancer
                          app.kubernetes.io/instance=vector
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=vector
                          app.kubernetes.io/version=2.6.12
                          helm.sh/chart=vector-0.37.0
Annotations:              meta.helm.sh/release-name: vector
                          meta.helm.sh/release-namespace: mcms
Selector:                 app.kubernetes.io/component=load-balancer,app.kubernetes.io/instance=vector,app.kubernetes.io/name=vector
Type:                     ClusterIP
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.98.195.94
IPs:                      10.98.195.94
Port:                     vector  6000/TCP
TargetPort:               6000/TCP
Endpoints:                10.244.0.160:6000
Port:                     prom-exporter  9090/TCP
TargetPort:               9090/TCP
Endpoints:                10.244.0.160:9090
Session Affinity:         None
Internal Traffic Policy:  Cluster
Events:                   <none>

This is the vector ingress:

Name:             vector-ingress
Labels:           <none>
Namespace:        mcms
Address:          
Ingress Class:    <none>
Default backend:  <default>
Rules:
  Host        Path  Backends
  ----        ----  --------
  *           
              /   vector-haproxy:vector (10.244.0.160:6000)
Annotations:  traefik.ingress.kubernetes.io/router.entrypoints: vector
Events:       <none>

I've also confirmed that a vector agent running inside the Traefik pod can connect to the vector aggregator through HAProxy directly:

/ # cat vector.yaml 
sources:
  dummy:
    type: demo_logs
    format: json
sinks:
  vector:
    type: vector
    inputs: [dummy]
    address: 10.244.0.160:6000

/ # /root/.vector/bin/vector -c vector.yaml 
2024-11-20T00:08:56.312067Z  INFO vector::app: Log level is enabled. level="info"
2024-11-20T00:08:56.317850Z  INFO vector::app: Loading configs. paths=["vector.yaml"]
2024-11-20T00:08:56.378145Z  INFO vector::topology::running: Running healthchecks.
2024-11-20T00:08:56.378429Z  INFO vector: Vector has started. debug="false" version="0.42.0" arch="x86_64" revision="3d16e34 2024-10-21 14:10:14.375255220"
2024-11-20T00:08:56.378447Z  INFO vector::app: API is disabled, enable by setting `api.enabled` to `true` and use commands like `vector top`.
2024-11-20T00:08:56.441257Z  INFO vector::topology::builder: Healthcheck passed.

So I'm not sure whether it's some network policy or whether I just didn't set up the Traefik entrypoint and the other necessary resources correctly.

Some help would be greatly appreciated.

Regards,
Roland.

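Ah, it turns out that vector-to-vector traffic goes over gRPC, which needs HTTP/2 all the way to the backend. Traefik was forwarding to the HAProxy service as plain http:// (see target=http://10.244.0.160:6000 in the logs above), so enabling h2c for the service is required (as noted in Kubernetes Ingress Routing Configuration - Traefik).

Adding that to the vector-haproxy service annotations seems to make it work!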
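
For anyone landing here later, this is roughly what the annotated service looks like. It is a sketch rather than my exact manifest: the name, namespace, selector and ports are copied from the `kubectl describe` output above, and the annotation key is the one I understand the Traefik Kubernetes Ingress docs to list for services, so check it against the docs for your Traefik version.

```yaml
# Sketch of the vector-haproxy Service with the h2c annotation added.
# Everything except the annotation mirrors the describe output above.
apiVersion: v1
kind: Service
metadata:
  name: vector-haproxy
  namespace: mcms
  annotations:
    # Ask Traefik to speak cleartext HTTP/2 (h2c) to this backend,
    # which gRPC traffic such as /vector.Vector/PushEvents needs.
    traefik.ingress.kubernetes.io/service.serversscheme: h2c
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/component: load-balancer
    app.kubernetes.io/instance: vector
    app.kubernetes.io/name: vector
  ports:
    - name: vector
      port: 6000
      targetPort: 6000
      protocol: TCP
    - name: prom-exporter
      port: 9090
      targetPort: 9090
      protocol: TCP
```

Note that this service is labelled as managed by Helm, so an annotation added by hand may be overwritten on the next chart upgrade. Setting it through the chart's values is probably the more durable route, assuming the chart exposes service annotations for the HAProxy component.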