Traefik fails to obtain Let's Encrypt certificate with dns-01 challenge

I've set up a TXT record in my DNS and configured Traefik to use ACME with the dns-01 challenge. According to the logs, the challenge was successfully validated and a certificate was issued; however, any attempt to connect to my endpoint fails at the SSL handshake.

I'm attaching all logs and overridden Helm chart values in a gist:

I'd appreciate it if someone could explain to me what is wrong, as I cannot find a clear explanation.

@daniel.tomcej ? any hints?

I'm not a fan of using the openssl client, as it can be very finicky about the environment it is run in.

The message "SSL routines:ssl23_write:ssl handshake failure" usually means that you are using an outdated or buggy version of openssl.

I prefer to use https://testssl.sh/ instead, as it has more information.

However, I tried to access your domain, but it appears the DNS is not resolving.

From your logs, it appears that the certificate was generated properly and the server was reloaded, and there is nothing in your logs signifying a failure.

Can you try testssl.sh and curl -v to debug?

Sure, here's testssl.sh:

ATTENTION: No cipher mapping file found!
Please note from 2.9 on testssl.sh needs files in "$TESTSSL_INSTALL_DIR/etc/" to function correctly.

Type "yes" to ignore this warning and proceed at your own risk --> yes

ATTENTION: No TLS data file found -- needed for socket-based handshakes
Please note from 2.9 on testssl.sh needs files in "$TESTSSL_INSTALL_DIR/etc/" to function correctly.

Type "yes" to ignore this warning and proceed at your own risk --> yes

###########################################################
    testssl.sh       3.0rc3 from https://testssl.sh/dev/
    (171abee 2019-07-23 19:19:49 -- )

      This program is free software. Distribution and
             modification under GPLv2 permitted.
      USAGE w/o ANY WARRANTY. USE IT AT YOUR OWN RISK!

       Please file bugs @ https://testssl.sh/bugs/

###########################################################

 Using "OpenSSL 1.0.2n  7 Dec 2017" [~129 ciphers]
 on tsunami:/anaconda3/bin/openssl
 (built: "reproducible build, date unspecified", platform: "darwin64-x86_64-cc")


 Start 2019-07-24 00:28:23        -->> 178.128.142.83:443 (traefik.k8s.cloud-technologies.net) <<--

 rDNS (178.128.142.83):  --

 178.128.142.83:443 doesn't seem to be a TLS/SSL enabled server
 The results might look ok but they could be nonsense. Really proceed ? ("yes" to continue) --> yes
./testssl.sh: line 9429: printf: missing hex digit for \x
./testssl.sh: line 9429: printf: missing hex digit for \x
 Service detected:       Couldn't determine what's running on port 443, assuming no HTTP service => skipping all HTTP checks


 Testing protocols via sockets except NPN+ALPN 

 SSLv2      not offered (OK)
 SSLv3      ./testssl.sh: line 9429: printf: missing hex digit for \x
not offered (OK)
 TLS 1      ./testssl.sh: line 9429: printf: missing hex digit for \x
not offered
 TLS 1.1    ./testssl.sh: line 9429: printf: missing hex digit for \x
not offered
 TLS 1.2    ./testssl.sh: line 9429: printf: missing hex digit for \x
./testssl.sh: line 9429: printf: missing hex digit for \x
not offered
 TLS 1.3    ./testssl.sh: line 9429: printf: missing hex digit for \x
./testssl.sh: line 9429: printf: missing hex digit for \x
not offered

You should not proceed as no protocol was detected. If you still really really want to, say "YES" --> no

Here's curl:

curl -v https://traefik.k8s.cloud-technologies.net
* Rebuilt URL to: https://traefik.k8s.cloud-technologies.net/
*   Trying 178.128.142.83...
* TCP_NODELAY set
* Connected to traefik.k8s.cloud-technologies.net (178.128.142.83) port 443 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /anaconda3/ssl/cacert.pem
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to traefik.k8s.cloud-technologies.net:443 
* Closing connection 0
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to traefik.k8s.cloud-technologies.net:443 

Again, Traefik seems to have received a certificate:

time="2019-07-23T21:54:49Z" level=info msg="legolog: [INFO] [traefik.k8s.cloud-technologies.net] acme: Validations succeeded; requesting certificates"
time="2019-07-23T21:55:43Z" level=info msg="legolog: [INFO] [traefik.k8s.cloud-technologies.net] Server responded with a certificate."

I get the same result from my workstation:

$ curl -vk https://traefik.k8s.cloud-technologies.net
* Rebuilt URL to: https://traefik.k8s.cloud-technologies.net/
*   Trying 178.128.142.83...
* TCP_NODELAY set
* Connected to traefik.k8s.cloud-technologies.net (178.128.142.83) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* LibreSSL SSL_connect: SSL_ERROR_SYSCALL in connection to traefik.k8s.cloud-technologies.net:443 
* stopped the pause stream!
* Closing connection 0
curl: (35) LibreSSL SSL_connect: SSL_ERROR_SYSCALL in connection to traefik.k8s.cloud-technologies.net:443 

Do you have a firewall or load balancer or IPS or something that could be interfering with external TCP connections?

SSL_ERROR_SYSCALL means that the TCP connection was interrupted: it was closed or reset at the socket level before the TLS handshake could complete.
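
If it helps to tell a TCP-level problem apart from a TLS-level one, here is a minimal Python sketch that runs the two stages separately: first a plain TCP connect, then a TLS handshake on the same socket. It is my own illustration and not part of Traefik, curl, or testssl.sh; the function name probe and its three return strings are made up for this example. Certificate verification is switched off on purpose, since the only question is whether a handshake completes at all.

```python
import socket
import ssl


def probe(host, port, server_name=None, timeout=5.0):
    """Return 'tcp_failed', 'tls_failed', or 'tls_ok' for host:port.

    Hypothetical helper for illustration only.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Nothing is listening, or the connection attempt was refused/dropped.
        return "tcp_failed"

    # Verification is disabled deliberately: we only care whether the
    # handshake completes, not whether the certificate is trusted.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with ctx.wrap_socket(sock, server_hostname=server_name or host):
            return "tls_ok"
    except OSError:
        # ssl.SSLError, resets, EOF and timeouts are all OSError subclasses:
        # TCP connected, but the handshake did not finish.
        return "tls_failed"
    finally:
        sock.close()
```

Calling probe("traefik.k8s.cloud-technologies.net", 443) from outside the cluster and getting tls_failed would match the curl output above: the TCP connection is accepted, but it is dropped before the handshake finishes. Running the same check from different points along the path can help show which hop drops it.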

There's a load balancer in front of the k8s cluster. Traefik is in NodePort mode. The LB forwards 80 -> 30080 and 443 -> 30443.