Issues with TLS Enabled TCP Router to OpenLDAP

Hello there,

I have run into some strange behavior in my traefik2 setup when proxying to an OpenLDAP server via a TCP router, and wanted to share my struggles here before creating an issue on GitHub. Maybe I'm just too stupid to get this configured properly :slightly_smiling_face:

All of this is on Traefik 2.1.1, which runs in a Docker container.
The main parts of the traefik.yaml:

entryPoints:
  ldap-entrypoint:
    address: ":637"

providers:
  docker:
    endpoint: "unix:///var/run/docker.sock"
    exposedByDefault: false
    network: web

certificatesResolvers:
  leresolver:
    acme:
      email: "letsencrypt@example.com"
      storage: "acme.json"
      tlsChallenge: {}

And here is ldap/docker-compose.yml, which defines the OpenLDAP container:

version: '3'

services:
    ldap:
        image: osixia/openldap:1.2.2
        restart: always
        environment:
            - LDAP_ORGANISATION=example
            - LDAP_DOMAIN=example.com
            - LDAP_ADMIN_PASSWORD=example
            - LDAP_TLS_VERIFY_CLIENT=try
            - LDAP_TLS_CRT_FILENAME=fullchain.cer
            - LDAP_TLS_KEY_FILENAME=example.key
            - LDAP_TLS_CA_CRT_FILENAME=ca.cer
        ports:
            - "636:636"
        expose:
            - 389
        volumes:
            - sso_ldap:/var/lib/ldap
            - sso_slapd:/etc/ldap/slapd.d
            - sso_certs:/container/service/slapd/assets/certs
        networks:
            - web
        labels:
            - "traefik.enable=true"
            - "traefik.docker.network=web"
            - "traefik.tcp.routers.app-sso-ldap.rule=HostSNI(`ldap.example.com`)"
            - "traefik.tcp.routers.app-sso-ldap.entrypoints=ldap-entrypoint"
            - "traefik.tcp.routers.app-sso-ldap.tls=true"
            - "traefik.tcp.routers.app-sso-ldap.tls.certresolver=leresolver"
            - "traefik.tcp.routers.app-sso-ldap.service=app-sso-ldap"
            - "traefik.tcp.services.app-sso-ldap.loadbalancer.server.port=389"

This runs perfectly fine with tls using the published port 636. The non-tls port 389 is exposed and configured as the service port for traefik to connect to.

Next, I verified with openssl that traefik successfully obtained a valid certificate and terminates the TLS connection properly:

>openssl s_client -connect ldap.example.com:637
CONNECTED(00000003)
depth=2 O = Digital Signature Trust Co., CN = DST Root CA X3
verify return:1
depth=1 C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
verify return:1
depth=0 CN = ldap.example.com
verify return:1
---
Certificate chain
 0 s:CN = ldap.example.com
   i:C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
 1 s:C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
   i:O = Digital Signature Trust Co., CN = DST Root CA X3
---
Server certificate
-----BEGIN CERTIFICATE-----
XXXXXXXXXXXXXXXXXXXX
-----END CERTIFICATE-----
subject=CN = ldap.example.com

issuer=C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3

---
No client certificate CA names sent
Peer signing digest: SHA256
Peer signature type: RSA-PSS
Server Temp Key: X25519, 253 bits
---
SSL handshake has read 3636 bytes and written 406 bytes
Verification: OK
---
New, TLSv1.3, Cipher is TLS_AES_256_GCM_SHA384
Server public key is 4096 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
---
Post-Handshake New Session Ticket arrived:
SSL-Session:
    Protocol  : TLSv1.3
    Cipher    : TLS_AES_256_GCM_SHA384
    Session-ID: XXX
    Session-ID-ctx: 
    Resumption PSK: XXX
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    TLS session ticket lifetime hint: 604800 (seconds)
    TLS session ticket:
    XXX

    Start Time: 1579300651
    Timeout   : 7200 (sec)
    Verify return code: 0 (ok)
    Extended master secret: no
    Max Early Data: 0
---
read R BLOCK
^C

This results in the following OpenLDAP log output:

5e2236c6 conn=1000 fd=12 ACCEPT from IP=172.18.0.2:33156 (IP=0.0.0.0:389)
5e2236c7 conn=1000 fd=12 closed (connection lost)

This indicates that the TLS connection can be opened and that the router is configured properly. The certificate should also be accepted by clients, as I have already verified that Let's Encrypt certificates work on LDAP servers.

But the debug log of ldapwhoami shows the following on a connection attempt to the same port as openssl:

> ldapwhoami -v -d 2 -H ldaps://ldap.example.com:637/ -D cn=user,ou=people,dc=example,dc=com -w example
ldap_initialize( ldaps://ldap.example.com:637/??base )
tls_write: want=293, written=293
  ...
TLS certificate verification: Error, unable to get local issuer certificate
tls_write: want=7, written=7
  0000:  15 03 03 00 02 02 30                               ......0           
TLS: can't connect: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed (unable to get local issuer certificate).
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)

Additionally no output is generated in the ldap logs, so this connection is definitely being killed by traefik somehow.

Any clue what could have led to this?
If anything is missing for further analysis I am happy to provide more information.

If this is not an issue with traefik itself I would try to address this issue somewhere else.

Otherwise thanks in advance for any hint and best regards,

djesionek

Very happy you have posted this @djesionek; I don't have a solution, but i am trying to solve the same problem, so I am hoping others can help shed some light.

Looks like we are alone on this one @djesionek :slight_smile:

A couple of observations: I suppose you have noticed you are using port 637 in your config. Is that correct, or a typo?

Also, I note that in my https services config (which works) the address / url of the destination is green, while the address for the tcp destination is red, indicating an error.

I am not sure how to debug this further but wondered if you have the same red destination address?

Hello @djesionek,

I notice that you have configured TLS verification for your LDAP via:

            - LDAP_TLS_VERIFY_CLIENT=try
            - LDAP_TLS_CRT_FILENAME=fullchain.cer
            - LDAP_TLS_KEY_FILENAME=example.key
            - LDAP_TLS_CA_CRT_FILENAME=ca.cer

But I don't see where you provide those certificates.

Also, can you confirm which certificate it is attempting to verify?

I had assumed traefik would be terminating the ssl connection instead of using these local certs, but I am likely wrong with that.

The connection does not work for me, with or without those local certs.

Traefik created the cert for openldap and stores it in my cert store, but as that is a JSON file, I am unable to link the individual files that openldap requires.

Telling openldap to use just my CA cert borks the openldap config too.

Appreciate I might be rambling but I wanted to share my testing to date :slight_smile:

Hi @daniel.tomcej,

thankfully not @mannp :smiley: .

That is in fact intentional. I only posted the cooked down version of what matters. In fact I have port 636 directly published via docker, 637 is intended for testing the TCP router and is thus published by the traefik2 container and 638 is the unencrypted connection just for double checking that this actually works as traefik2 should route to the unencrypted port. The alignment is purely for practical purposes (a.k.a laziness :wink: ).

I suppose you mean what can be seen here in the second screenshot? That would also be the case for my setup. I wonder if there is any documentation on these dashboard elements as I couldn't find any explanation for this now.
At least something from the connection seems to get through as I found out from my initial testing described in the first post.

When I get to it, it may be worth raising the log level to debug until something suspicious shows up in the traefik logs. The same may be necessary for the other tools in this chain (OpenLDAP itself, the ldapwhoami tool ...).

The option value try tells OpenLDAP that it should at least try to verify the client's identity against a CA [1]. Of course, a client would need to be specially configured for that too (presumably just like a server; I have never tried client verification with SSL though).
Maybe this "try" is implemented in a way that traefik can not handle. Will check changing this to "never" instead and work with that so there is less to go wrong.

Actually OpenLDAP uses the certificates I specified there for terminating ssl on port 636. Traefik should definitely go against the unencrypted port which is usually 389 for ldap, at least in that case and to prevent further problem sources.
My configuration for traefik defines that traefik terminates the ssl connection and an unencrypted connection should be established on 389, which also seems to work partly as seen in my first post.
The certificates obtained by traefik should therefore not be needed to be imported/configured into OpenLDAP in any way.
Did I understand your point correctly here @mannp ?

EDIT:
I just noticed, that on one hand the openssl command returns the correct LE certificate but Apache-Directory-Studio does show a warning, that the server uses a self-signed certificate instead (the "DEFAULT TRAEFIK" one). Quite strange that this is handled somehow differently. ATM I can not explain why this is.

After poking around in the communication traces via Wireshark I finally found the source of our problems. Traefik relies on the SNI [1] to track and route connections to services [2]. I could verify that neither Apache-Directory-Studio nor the OpenLDAP tools use this optional TLS extension, so this configuration can not work properly.
This is supported by the observation that openssl managed to cause OpenLDAP to log a connection and received the correct certificate while the ldap tools just got the default certificate and canceled the connection regardless of the certificate setting (ignore insecure or not).
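
For anyone who wants to check the SNI point without Wireshark, here is a minimal sketch in Python (not something used in this thread; the function names are made up for illustration). It builds the first TLS message a client would send, once with and once without a server name, using the standard ssl module's in-memory BIOs so that no network connection is needed. It assumes the client does not encrypt the ClientHello, which holds for Python's ssl module.

```python
import ssl


def client_hello_bytes(server_hostname=None):
    """Return the first TLS flight (the ClientHello) a client would send.

    Hypothetical helper for illustration: nothing is sent over the network,
    the handshake is driven against in-memory buffers only.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if server_hostname is None:
        # Without a hostname there is no name to check the certificate against,
        # and Python refuses to wrap the connection unless this is disabled.
        ctx.check_hostname = False
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the ClientHello has been written and the client now
        # waits for a ServerHello that never arrives.
        pass
    return outgoing.read()


def carries_sni(hello, hostname):
    """True if the hostname appears in the ClientHello in clear text."""
    return hostname.encode("ascii") in hello


if __name__ == "__main__":
    name = "ldap.example.com"
    print("with server name:   ", carries_sni(client_hello_bytes(name), name))
    print("without server name:", carries_sni(client_hello_bytes(None), name))
```

A client that behaves like the second call gives traefik no hostname to match against a HostSNI(`ldap.example.com`) rule, which fits the observation that the ldap tools only ever saw the default certificate.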

I suppose it would be a valid solution for anyone trying to run OpenLDAP behind traefik to route all incoming connections on a given port to the service. That way it would at least be possible to offload certificate renewal and termination. So I tried the following configuration change:

-            - "traefik.tcp.routers.app-sso-ldap.rule=HostSNI(`ldap.example.com`)"
+            - "traefik.tcp.routers.app-sso-ldap.rule=HostSNI(`*`)"

Here I would expect a connection to be possible and to show up in the OpenLDAP log, which unfortunately isn't the case yet...
Maybe this is the reason why:

It is important to note that the Server Name Indication is an extension of the TLS protocol. Hence, only TLS routers will be able to specify a domain name with that rule. However, non-TLS routers will have to explicitly use that rule with * (every domain) to state that every non-TLS request will be handled by the router. [3]

I read that text (especially the part "to state that every non-TLS request will be handled by the router.") as saying that such catch-all rules only apply to incoming non-TLS connections.
Does anyone know a solution that would enable traefik to use a default certificate generated through letsencrypt with a custom hostname (of course one that can be validated by traefik) on such wildcard rules?

Not sure if this question is even in scope for this thread. I could also create a new one, or even a GitHub issue requesting this as a feature?

EDIT: Of course it would be an alternative to request the SNI extension for OpenLDAP. I think both, the possibility to handle such things in traefik and a proper OpenLDAP implementation of SNI, are useful or even necessary...
Though I think extending traefik would be a quicker solution for users, as an update of OpenLDAP client tools would not be enough to fix all issues regarding that. Potentially all applications communicating with the ldaps protocol would need to be updated for that.

Thanks for the detailed response, I believe I understand your setup more now, as you seem to have a dual setup using local certs on port 636 and getting traefik to terminate to 389.

Your explanation above is exactly how I understood things to be working, but your use of local certs for openldap is what threw me :slight_smile:

I am just using traefik2 to try and terminate the ssl and pass the unencrypted through to 389.

Have you tried passthrough mode here? I'm not sure whether it is what is needed, or indeed how it works in this case.

Okay, yes. My test also gave a successful connection with openssl, but a quickly disconnected one with the openldap tools.

I currently use HostSNI(`ldap.home.local`) as the full fqdn resolving to the address of the ldap server and I get a cert from my local acme server no problem; stored in the traefik cert store.

That said I am unable to connect to that address successfully, but likely due to the reason you suggest.

Is there another LDAP implementation perhaps that does implement SNI (better)? 389 Directory Server perhaps....

Perhaps others in the thread can assist with the best way forward from a traefik2 point of view, but for me, being able to terminate openldap is a great feature, if it's possible.

Regarding the screenshots: yes, that is exactly it. I get a red triangle as you do, and I too have not found any docs explaining it, or indeed explaining how to format the address or url (I'm not sure which it is) in the traefik2 yaml config file.

I did wonder whether we should be specifying ldap://: similar to that of ws://: but couldn't find any docs, so assumed it was address and port only for TLS connections.

That would be a way to just get the connections through to the OpenLDAP server which would still terminate the connection. This would still require separate certificate management on the server side which is exactly what I would like to replace.

The issue here is the protocol implementation on the client side, so maybe 389 provides client cli tools that could use the SNI extension. But this would not solve the problem for all applications that might use ldap for authentication and such.

I still think that the change I suggested would be a good feature for traefik. Many applications that use SSL encryption over a protocol other than https might have the same issue, and it could be solved the same way, as they often use other default ports anyway.
Maybe I should look into the code myself at some point, but I'm not quite fluent in Go development.

IP address and port should be enough for tcp at least. Examples in the documentation also just show IP addresses for configuration of tcp based services. On the other side this should not be necessary when using the docker integration, I assume you use another discovery method or just configuration files?

I see now, thanks for taking the time to teach :slight_smile:

Hopefully someone will comment.

Not using docker compose makes adding labels to each docker config a pain, so I configured an optimum https config and copy paste the sections in the traefik config for each http server I wish to add https.

Hi @djesionek did you manage to make any further progress on this one?

I just wanted to thank you both for this. I've been trying to terminate ldaps requests from Fusion Directory at port 636 with Traefik 2.3, and route to the OpenLDAP container on 389. I especially appreciate @djesionek poking around with Wireshark to explain not only why it doesn't work, but why it can't under present circumstances.

I will have to find another option. Thanks again to you both.

Great discussion here, but just to clarify some things:

  • Simultaneously routing, with real certificates, to LDAP/LDAPS works fine through Traefik.
  • OpenLDAP does not work with HostSNI(`ldap.domain`); only HostSNI(`*`) matches. I am not sure why one would care.
  1. Be sure your ports/entrypoints are being opened on the host via Traefik (not via your ldap container):
services:
  traefik:
    image: traefik:v2.5
    ports:
      # Listen on port 80, default for HTTP, necessary to redirect to HTTPS
      - target: 80
        published: 80
        mode: host
      # Listen on port 443, default for HTTPS
      - target: 443
        published: 443
        mode: host
      # Listen on ports 389/636, LDAP and LDAPS
      - target: 389
        published: 389
        mode: host
      - target: 636
        published: 636
        mode: host
    command:
      - --entrypoints.ldap-tcp.address=:389/tcp
      - --entrypoints.ldap-udp.address=:389/udp
      - --entrypoints.ldaps.address=:636/tcp

  2. Square up your certbot certificates. I use a DNS resolver to grab a wildcard:
  • Note: The first time you deploy this container and the certbot-certificates volume is created, you need to manually access the volume and chmod 755 the live and archive directories; also chmod 644 privkey.pem for good measure. This change will persist through renewals.
services:
  certbot:
    image: certbot/dns-route53
    command: certonly -n --dns-route53 -d *.domain.tld -m foo@bar --agree-tos --server https://acme-v02.api.letsencrypt.org/directory
    volumes:
      - certbot-certificates:/etc/letsencrypt
      - certbot-data:/var/lib/letsencrypt
    environment:
      - DOMAIN=*.domain.tld
      - EMAIL=foo@bar
      - TZPATH=America/Chicago
      - AWS_ACCESS_KEY_ID=foo
      - AWS_SECRET_ACCESS_KEY=bar
      - AWS_DEFAULT_REGION=us-east-1
    deploy:
      restart_policy:
        condition: any
        delay: 24h
  chainchecker:
    image: superseb/cert-check
    volumes:
      - /mnt/docker/volumes/ops_certbot-certificates/_data/live/domain.tld/cert.pem:/certs/cert.pem
      - /mnt/docker/volumes/ops_certbot-certificates/_data/live/domain.tld/privkey.pem:/certs/key.pem
      - /mnt/docker/volumes/ops_certbot-certificates/_data/live/domain.tld/cacert.pem:/certs/cacerts.pem
      - /mnt/docker/volumes/ops_certbot-certificates/_data/live/domain.tld:/cert-check
    command: cd /certs ; curl -o root.pem https://letsencrypt.org/certs/isrg-root-x1-cross-signed.pem ; cat /certs/chain.pem /certs/root.pem > cacerts.pem ; domain.tld resolv
    deploy:
      restart_policy:
        condition: any
        delay: 24h

  3. Pass good certs straight into openldap:
services:
  openldap:
    image: osixia/openldap:latest
    environment:
      - LDAP_TLS_CRT_FILENAME=live/domain.tld/fullchain.pem
      - LDAP_TLS_KEY_FILENAME=live/domain.tld/privkey.pem
      - LDAP_TLS_CA_CRT_FILENAME=live/domain.tld/root.pem
    volumes:
      - openldap_data:/var/lib/ldap
      - slapd_data:/etc/ldap/slapd.d
      - certbot-certificates:/container/service/slapd/assets/certs/
    deploy:
      labels:
        - traefik.enable=true
        - traefik.docker.network=traefik-public
        - traefik.constraint-label=traefik-public

        # LDAP
        - traefik.tcp.routers.ldap-tcp.rule=HostSNI(`*`)
        - traefik.tcp.routers.ldap-tcp.entrypoints=ldap-tcp
        - traefik.tcp.routers.ldap-tcp.service=ldap-tcp-svc
        - traefik.tcp.services.ldap-tcp-svc.loadbalancer.server.port=389/tcp

        - traefik.udp.routers.ldap-udp.entrypoints=ldap-udp
        - traefik.udp.routers.ldap-udp.service=ldap-udp-svc
        - traefik.udp.services.ldap-udp-svc.loadbalancer.server.port=389/udp

        # LDAPS
        - traefik.tcp.routers.ldaps.rule=HostSNI(`*`)
        - traefik.tcp.routers.ldaps.entrypoints=ldaps
        - traefik.tcp.routers.ldaps.tls=true
        - traefik.tcp.routers.ldaps.tls.passthrough=true
        - traefik.tcp.routers.ldaps.tls.certresolver=route53
        - traefik.tcp.routers.ldaps.service=ldaps-svc
        - traefik.tcp.services.ldaps-svc.loadbalancer.server.port=636
    networks:
      - traefik-public

@trhr this works great!!!
One question: can you share your traefik compose too? I want to look at your route53 certresolver part so that I can adapt it to my own needs with inwx.

Guys, has anything changed in the meantime? Has traefik learned to work with such hosts, so that certificates can be generated without a third-party certbot container?