Can't get the HTTP out of træfik

My goal is to use traefik to serve both HTTPS and MQTTS, and to have traefik terminate the TLS using letsencrypt certificates. But for this post I have removed as much as I could to emphasize the problem I face:

It seems traefik will always fall back to HTTP if it can. Here is the configuration to demonstrate:

My docker-compose.yaml

version: '3'
services:
    traefik:
        image: traefik:v2.2
        ports:
          - 8883:8883
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock:ro
          - ./traefik.yaml:/traefik.yaml:ro

My traefik.yaml

providers:
    docker:
        exposedByDefault: false
entrypoints:
    mqtts:
        address: :8883

Bring it up and then try to connect with a telnet client. Hit enter a few times to provoke traefik:

$ telnet localhost 8883
Trying ::1...
Connected to localhost.
Escape character is '^]'.

HTTP/1.1 400 Bad Request
Content-Type: text/plain; charset=utf-8
Connection: close

400 Bad RequestConnection closed by foreign host.

Yes, it's indeed a bad HTTP request, but I didn't expect it to respond with HTTP at all. Remember, we haven't instructed traefik to do anything.

When I put all the pieces back together, this unexpected HTTP response confuses my MQTT clients, which is why I would rather do without this default HTTP handling.

Can I instruct traefik not to hand my traffic to its HTTP engine when it does not immediately understand the data?

Can you update your post with the rest of the config/labels for the routers of your service?

Sure, here goes: (I have replaced the domain name with example.com)

My docker-compose.yaml

version: '3'
services:
    traefik:
        image: traefik:v2.2
        ports:
          - 80:80
          - 8080:8080
          - 8883:8883
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock:ro
          - ./traefik.yaml:/traefik.yaml:ro
          - ./letsencrypt:/letsencrypt
    mosquitto:
        image: eclipse-mosquitto
        labels:
          - traefik.enable=true
          - traefik.tcp.routers.mosq_r.entryPoints=mqtts
          - traefik.tcp.routers.mosq_r.rule=HostSNI(`mosquitto.example.com`)
          - traefik.tcp.routers.mosq_r.tls=true
          - traefik.tcp.routers.mosq_r.tls.certresolver=letsencrypt
          - traefik.tcp.routers.mosq_r.service=mosq_s
          - traefik.tcp.services.mosq_s.loadbalancer.server.port=1883

And my traefik.yaml

log:
    level: DEBUG
api:
    dashboard: true
    insecure: true
providers:
    docker:
        exposedByDefault: false
entrypoints:
    plain:
        address: :80
    traefik:
        address: :8080
    mqtts:
        address: :8883
certificatesResolvers:
    letsencrypt:
        acme:
            email: spamtrap@example.com
            storage: /letsencrypt/acme.json
            httpChallenge:
                entryPoint: plain

Fire it up and let's inspect the certificate:

$ openssl s_client -connect mosquitto.example.com:8883 -showcerts </dev/null | openssl x509 -noout -dates
depth=2 O = Digital Signature Trust Co., CN = DST Root CA X3
verify return:1
depth=1 C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
verify return:1
depth=0 CN = mosquitto.example.com
verify return:1
DONE
notBefore=Apr 13 14:30:27 2020 GMT
notAfter=Jul 12 14:30:27 2020 GMT

So the endpoint indeed speaks TLS. And the traefik and mosquitto logs reveal activity as well. Of course mosquitto can't make any sense of it.

Now let's try the crazy telnet:

$ telnet mosquitto.example.com 8883
Trying xx.xx.xx.xx...
Connected to mosquitto.example.com.
Escape character is '^]'.

HTTP/1.1 400 Bad Request
Content-Type: text/plain; charset=utf-8
Connection: close

400 Bad RequestConnection closed by foreign host.

OK, so now we are speaking plain HTTP.

Finally, let's try the mosquitto_pub client:

$ mosquitto_pub --cafile /etc/ssl/certs/ca-certificates.crt -h mosquitto.example.com -p 8883 -t foo -m bar
Error: A TLS error occurred.

My guess is that mosquitto_pub tries to negotiate the connection and this somehow collides with traefik's attempt to do the same.

The config looks okay to me at first glance.

Telnet just ain't going to cut it for testing. The client needs to send the TLS ClientHello with the server name to hit your router.

If you use openssl s_client, can you send commands to mosquitto? (I don't know if this is possible with this protocol.)

Can you use the debug flag on mosquitto_pub and see if there is more useful information?
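To compare what the endpoint answers with and without a server name in the ClientHello, a small probe along these lines might help. This is only a sketch using the Python standard library; `probe` and `common_name` are names made up for this example, and it assumes the system CA bundle is the one the client should trust.

```python
import socket
import ssl
from typing import Optional


def common_name(cert: dict) -> Optional[str]:
    """Extract the subject CN from the dict returned by SSLSocket.getpeercert()."""
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def probe(host: str, port: int, sni: Optional[str]) -> str:
    """Do a verified TLS handshake, sending `sni` as server name (or none at all)."""
    ctx = ssl.create_default_context()  # verifies against the system CA bundle
    if sni is None:
        # Python refuses hostname checking when no server name is sent,
        # so only the certificate chain is verified in that case.
        ctx.check_hostname = False
    try:
        with socket.create_connection((host, port), timeout=5) as raw:
            with ctx.wrap_socket(raw, server_hostname=sni) as tls:
                return f"verified, CN={common_name(tls.getpeercert())}"
    except ssl.SSLError as exc:
        # Includes certificate verification failures.
        return f"TLS error: {exc}"
```

Calling `probe("mosquitto.example.com", 8883, "mosquitto.example.com")` and then `probe("mosquitto.example.com", 8883, None)` should show whether the missing server name changes which certificate comes back. Connection errors other than TLS errors are left to propagate.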


Thanks, it has taken me quite a fight to get to this point. I must admit the new v2 configuration is orders of magnitude harder for me than the old v1.7.

With your nod to the config, I dug deeper. With the help of Wireshark I inspected the TLS headers, and comparing the openssl tools with mosquitto_pub I discovered that the TLS extension server_name is missing when using the mosquitto tools.

A quick search revealed that SNI support was added just a few years back, and that the versions I run are ancient...
Testing with a modern set of mosquitto tools, it works. BAM!

Now unfortunately many software stacks are tied to the older mosquitto libraries, so I will have to leave SNI behind altogether.
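The same check that was done by eye in Wireshark can be sketched in code: look at the first bytes a client sends and see whether the server_name extension is present. The following is a minimal sketch, not how traefik itself does it; `sample_client_hello` and `server_name` are names invented here, and the parser assumes the whole ClientHello sits in a single TLS record inside the buffer.

```python
import ssl
from typing import Optional


def sample_client_hello(sni: Optional[str]) -> bytes:
    """Produce the first bytes a Python TLS client would send, without any network."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if sni is None:
        ctx.check_hostname = False  # required when no server name is given
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=sni)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the client is now waiting for the server's reply
    return outgoing.read()


def server_name(data: bytes) -> Optional[str]:
    """Return the SNI host name from a TLS ClientHello, or None if it is absent."""
    # Record header: content type (1 byte, 0x16 = handshake), version (2), length (2).
    if len(data) < 5 or data[0] != 0x16:
        raise ValueError("not a TLS handshake record")
    pos = 5
    if data[pos] != 0x01:
        raise ValueError("not a ClientHello")
    pos += 4                                              # handshake type + length
    pos += 2 + 32                                         # client version + random
    pos += 1 + data[pos]                                  # session id
    pos += 2 + int.from_bytes(data[pos:pos + 2], "big")   # cipher suites
    pos += 1 + data[pos]                                  # compression methods
    if pos + 2 > len(data):
        return None                                       # no extensions at all
    end = min(pos + 2 + int.from_bytes(data[pos:pos + 2], "big"), len(data))
    pos += 2
    while pos + 4 <= end:
        ext_type = int.from_bytes(data[pos:pos + 2], "big")
        ext_len = int.from_bytes(data[pos + 2:pos + 4], "big")
        pos += 4
        if ext_type == 0:  # server_name extension
            # Layout: list length (2), name type (1), name length (2), name.
            name_len = int.from_bytes(data[pos + 3:pos + 5], "big")
            return data[pos + 5:pos + 5 + name_len].decode("ascii")
        pos += ext_len
    return None
```

Here `server_name(sample_client_hello(None))` stands in for an old client that sends no SNI. Input that does not start with a handshake record, such as a few newlines typed into telnet, raises `ValueError`.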

I have seen others use

HostSNI(`*`)

to make it accept every host (not sure if I got this right).
But I can't figure out how to tell it which certificate to use. How do I specify the host name?

I am fully aware that without SNI I can only serve a specific host name on a specific port, that's ok.

It is hard-won knowledge. But it is flexible and we get TCP and UDP.

Huzzah!

Ouch, that hurts. I'm hitting that pain point with other services/libraries myself.

I'm not sure, but try the tls.domains option(s):

- traefik.tcp.routers.mosq_r.tls.domains[0].main=mosquitto.example.com

Well, it seems it did half of it. Now traefik knows which host name to serve. The last two lines of the traefik log reveal:

traefik_1    | time="2020-04-13T20:14:45Z" level=debug msg="Looking for provided certificate(s) to validate [\"mosquitto.example.com\"]..." providerName=letsencrypt.acme
traefik_1    | time="2020-04-13T20:14:45Z" level=debug msg="No ACME certificate generation required for domains [\"mosquitto.example.com\"]." providerName=letsencrypt.acme

Next I test with the openssl tools.

$ openssl s_client -connect mosquitto.example.com:8883 -showcerts </dev/null | openssl x509 -noout -dates
depth=2 O = Digital Signature Trust Co., CN = DST Root CA X3
verify return:1
depth=1 C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
verify return:1
depth=0 CN = mosquitto.example.com
verify return:1
DONE
notBefore=Apr 13 18:59:03 2020 GMT
notAfter=Jul 12 18:59:03 2020 GMT

If I test with a different host name, which still points to the same IP:

$ openssl s_client -connect foo.example.com:8883 -showcerts </dev/null | openssl x509 -noout -dates
depth=0 CN = TRAEFIK DEFAULT CERT
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0 CN = TRAEFIK DEFAULT CERT
verify error:num=21:unable to verify the first certificate
verify return:1
DONE
notBefore=Apr 13 20:14:45 2020 GMT
notAfter=Apr 13 20:14:45 2021 GMT

I'm not really surprised here, but I had still hoped both requests would return the same certificate.

Finally, let's try the mosquitto_pub command again, the antique version. It fails the same way as before.

The traefik log reveals two lines that might give a clue:

traefik_1    | time="2020-04-13T20:28:32Z" level=debug msg="Serving default certificate for request: \"\""
traefik_1    | time="2020-04-13T20:28:32Z" level=error msg="Error during connection: readfrom tcp 172.30.11.130:56996->172.30.11.131:1883: remote error: tls: unknown certificate authority"

So even though we have told traefik which host name this route is for, it still won't pull the right ACME certificate.

I'm not entirely sure if it is the same issue that was discussed two years ago in traefik/traefik issue #2829 on GitHub, "TLS handshake without SNI delivers TRAEFIK DEFAULT CERT instead ACME generated certificate".

You need to add -servername with openssl to set the servername.

Edit: I guess not

Might it be that HostSNI(`*`) can only handle pure TCP?

Ah yes, I could have used -servername instead in openssl test #2. Test #2 was only meant to prove that the

- traefik.tcp.routers.mosq_r.tls.domains[0].main=mosquitto.example.com

line had any effect at all.

Hmm, that would be very annoying. I will give this thread a few days. If nobody comes up with a solution I might start a new thread. Thanks to your help I will now be able to make it much more to the point; even a meaningful subject will be within reach :slight_smile: