Intermittent mTLS peer certificate mismatch with Consul Connect

We run all of our applications under Nomad, leveraging Consul Connect for communications and Traefik as our ingress. Occasionally, when hitting a valid URL, Traefik checks the mTLS peer certificate against an incorrect service name.

Example: a user attempts to hit https://correct-service-name.ourdomain.com, for which Traefik has a router that routes to the service 'correct-service-name'.

The error we are seeing:

2024-08-02T22:05:46Z DBG github.com/traefik/traefik/v3/pkg/server/service/proxy.go:100 > 500 Internal Server Error error="peer certificate mismatch got spiffe://7bd56483-8258-cd69-3a9b-0a2a8d04797f.consul/ns/default/dc/main/svc/correct-service-name, want spiffe:///ns/default/dc/main/svc/incorrect-service-name"

Both 'correct-service-name' and 'incorrect-service-name' are actual services running in our Nomad environment and have registered service entries in the Consul Catalog. Both have router entries in Traefik that match the form: https://SVC.ourdomain.com. What I can't figure out is where Traefik is getting the incorrect name from. It doesn't have the Consul UUID in it, and it's for the wrong service. This is also very intermittent. Sometimes the service will work flawlessly; other times Traefik will return an HTTP 500 to the client with no rhyme or reason.

Can anyone point me in the right direction on this? I'm at a loss as to where the problem may lie and how to diagnose.

Answering my own question, for anyone who stumbles across this. The problem was the result of an upstream bug in Golang's HTTP/2 support. Setting tls.options.default.alpnProtocols to [ http/1.1 ] disables HTTP/2 and fixes it. For more information, see "Upstream golang HTTP2 bug hangs chromium-based browsers", Issue #7953 in traefik/traefik on GitHub.
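For reference, here is a minimal sketch of that setting as a Traefik dynamic configuration fragment. It assumes the file provider with YAML; the same key path applies if you supply dynamic configuration another way. Only the alpnProtocols key comes from the fix above; the file layout is an assumption.

```yaml
# Dynamic configuration (assumed: loaded via Traefik's file provider).
tls:
  options:
    default:
      # Advertise only HTTP/1.1 during the TLS handshake, so clients
      # never negotiate HTTP/2 (h2) with Traefik.
      alpnProtocols:
        - http/1.1
```

As far as I know, Traefik's default ALPN list also includes acme-tls/1, so if you rely on the TLS-ALPN-01 ACME challenge you will probably want to keep that entry in the list.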
