Traefik2 dnsChallenge with wild-card domain

I'm trying to convert a working traefik1 config to v2. The concept I used is that all my services (which run in docker) run on http, with traefik applying a wildcard cert obtained via letsencrypt acme dnschallenge.

The idea is that the compose label config for services enabled in traefik should not require any https related config - this should be encapsulated in the static config in the toml/yml files. For all services incoming http should be redirected to https, also in the static cfg.

The working v1 configs looked like this:

debug = true
logLevel = "INFO"
defaultEntryPoints = ["http", "https"]
[entryPoints]
  [entryPoints.traefik]
  address = ":8088"
  [entryPoints.http]
  address = ":80"
    [entryPoints.http.redirect]
    entryPoint = "https"
  [entryPoints.https]
  address = ":443"
  [entryPoints.https.tls]
[traefikLog]
filePath = "traefik.log"
[accessLog]
filePath = "traefik_access.log"
[api]
[ping]
[acme]
email = "mymail@some.domain"
storage = "acme.json"
entryPoint = "https"
acmeLogging = true
[[acme.domains]]
  main = "*.mydomain.com"
[acme.dnsChallenge]
  provider = "ovh"
  delayBeforeCheck = 0
[docker]
endpoint = "tcp://192.168.123.55:2375"
exposedByDefault = false
network = "traefik"

Traefik labels in service compose:

    labels:
      # traefik-v1:
      - "traefik.enable=true"
      - "traefik.frontend.rule=Host:mb.mydomain.com"
      - "traefik.port=8096"
      - "traefik.backend=mb"

I've tried converting this to traefik2. Without certs I have a working setup, but when adding certificatesResolvers etc I see nothing related to cert requests in the logs. I have removed the v1 acme.json as its format seems to be no longer compatible. Current config looks like this (switched to yml format as this does not repeat the base identifiers):

global:
  checkNewVersion: true
entryPoints:
  web:
    address: ":80"
  web-secure:
    address: ":443"
log:
  level: "DEBUG"
accessLog:
  filePath: "traefik_access.log"
api:
  insecure: true
providers:
  docker:
    endpoint: "tcp://192.168.123.55:2375"
    exposedByDefault: false
    network: "traefik"
certificatesResolvers:
  sample:
    acme:
      email: "mymail@some.domain"
      storage: "acme.json"
      dnsChallenge:
        provider: "ovh"
        delayBeforeCheck: 0

The service compose labels:

    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.mb-server.rule=Host(`mb.mydomain.com`)"
      - "traefik.http.routers.mb-server.entryPoints=web, web-secure"
      - "traefik.http.services.mb-server.loadbalancer.server.port=8096"

The required OVH-specific config is provided via env vars.

What seems to be missing in the v2 cfg is the [[acme.domains]] declaration which provided the wild-card domain in v1.

I'm also unclear what the 'sample' label under certificatesResolvers should be.

If anybody can point me to the issues in my config I'd be very grateful!!

Hello,

the domains must be defined on the router: https://docs.traefik.io/v2.0/routing/routers/#domains

- "traefik.http.routers.mb-server.tls.domains[0].main=*.mydomain.com"

Thanks @Idez, that helped me move forward. Something I thought was not clear from the docs is that I also need to set tls.certresolver on the router, and that the value assigned to it must match an entry in certificatesResolvers in the static config. That's the item set to sample in the docs (which I questioned above). I set both to "letsencrypt".

With those changes I now see a "letsencrypt" section with Account info & a private key in acme.json, but Certificates is null. I see DNS challenges in the logs, but these fail. The labels are now:

    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.mb-server.rule=Host(`mb.mydomain.com`)"
      - "traefik.http.routers.mb-server.entryPoints=web, web-secure"
      - "traefik.http.routers.mb-server.tls=true"
      - "traefik.http.routers.mb-server.tls.domains[0].main=mydomain.com"
      - "traefik.http.routers.mb-server.tls.domains[0].sans=*.mydomain.com"
      - "traefik.http.routers.mb-server.tls.certresolver=letsencrypt"
      - "traefik.http.services.mb-service.loadbalancer.server.port=8096"

The resulting log:

DEBU[2019-09-19T12:34:00Z] Domains ["mydomain.com" "*.mydomain.com"] need ACME certificates generation for domains "mydomain.com,*.mydomain.com".  providerName=letsencrypt.acme
DEBU[2019-09-19T12:34:00Z] Loading ACME certificates [mydomain.com *.mydomain.com]...  providerName=letsencrypt.acme
DEBU[2019-09-19T12:34:00Z] Building ACME client...                       providerName=letsencrypt.acme
DEBU[2019-09-19T12:34:00Z] https://acme-v02.api.letsencrypt.org/directory  providerName=letsencrypt.acme
DEBU[2019-09-19T12:34:01Z] Using DNS Challenge provider: ovh             providerName=letsencrypt.acme
DEBU[2019-09-19T12:34:01Z] legolog: [INFO] [mydomain.com, *.mydomain.com] acme: Obtaining bundled SAN certificate 
DEBU[2019-09-19T12:34:02Z] legolog: [INFO] [*.mydomain.com] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz-v3/*******
DEBU[2019-09-19T12:34:02Z] legolog: [INFO] [mydomain.com] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz-v3/*******
DEBU[2019-09-19T12:34:02Z] legolog: [INFO] [*.mydomain.com] acme: use dns-01 solver 
DEBU[2019-09-19T12:34:02Z] legolog: [INFO] [mydomain.com] acme: Could not find solver for: tls-alpn-01 
DEBU[2019-09-19T12:34:02Z] legolog: [INFO] [mydomain.com] acme: Could not find solver for: http-01 
DEBU[2019-09-19T12:34:02Z] legolog: [INFO] [mydomain.com] acme: use dns-01 solver 
DEBU[2019-09-19T12:34:02Z] legolog: [INFO] [*.mydomain.com] acme: Preparing to solve DNS-01 
DEBU[2019-09-19T12:34:03Z] legolog: [INFO] [mydomain.com] acme: Preparing to solve DNS-01 
DEBU[2019-09-19T12:34:03Z] legolog: [INFO] [*.mydomain.com] acme: Cleaning DNS-01 challenge 
DEBU[2019-09-19T12:34:03Z] legolog: [WARN] [*.mydomain.com] acme: error cleaning up: ovh: unknown record ID for '_acme-challenge.mydomain.com.'  
DEBU[2019-09-19T12:34:03Z] legolog: [INFO] [mydomain.com] acme: Cleaning DNS-01 challenge 
DEBU[2019-09-19T12:34:03Z] legolog: [WARN] [mydomain.com] acme: error cleaning up: ovh: unknown record ID for '_acme-challenge.mydomain.com.'  
ERRO[2019-09-19T12:34:04Z] Unable to obtain ACME certificate for domains "mydomain.com,*.mydomain.com" : unable to generate a certificate for the domains [mydomain.com *.mydomain.com]: acme: Error -> One or more domains had a problem:
[*.mydomain.com] [*.mydomain.com] acme: error presenting token: ovh: error when call api to add record (/domain/zone/mydomain.com/record): json: cannot unmarshal number 5025119774 into Go struct field Record.id of type int
[mydomain.com] [mydomain.com] acme: error presenting token: ovh: error when call api to add record (/domain/zone/mydomain.com/record): json: cannot unmarshal number 5025119775 into Go struct field Record.id of type int  providerName=letsencrypt.acme

The 'Could not find solver' messages suggest there is still a cfg issue, but there also seems to be a Go-typing issue - 5025119775 seems a valid int to me but unmarshalling throws an error.

Is this a bug?

Any other suggestions what might be wrong with the cfg?

Those lines are normal ([INFO]); it's because the acme lib iterates over the challenge types from LE.

TLS > HTTP > DNS

The last line is the proof that Traefik uses the right solver (DNS).

Ok, that's good info, but what about the unmarshal go-errors? Since there is no 'lego' prefix to the error msg I presume this is coming from traefik? Is it possible that a cert is in fact being returned but the response processing in traefik is failing?

This is still working in my traefik1 cfg, so I expect the ovh side to be ok.

Are you using a 32-bit version or an ARM version of Traefik?

Yes - arm7 - it's running in an lxc container on my Turris Omnia router.

Linux xxxxxxxx 4.4.191-a890a5a94ebb621f8f1720c24d12fef1-0 #1 SMP Thu Sep 12 12:58:20 CEST 2019 armv7l Linux

./traefik version
Version:      2.0.0
Codename:     montdor
Go version:   go1.13
Built:        2019-09-16T17:35:11Z
OS/Arch:      linux/arm

Is this a problem?

v2 uses the same code as v1 for OVH.

json: cannot unmarshal number 5025119774 into Go struct field Record.id of type int

In this case, the max value of an int:

  • on a 32-bit arch is 2147483647
  • on a 64-bit arch is 9223372036854775807

So 5025119774 is higher than 2147483647, and the record ID returned by OVH does not fit into an int on a 32-bit build.
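The overflow can be reproduced on any machine with a minimal sketch. This is not the actual OVH provider code: the struct names record32 and record64 are hypothetical, and int32 is used as a stand-in for what a plain int is on a 32-bit build such as linux/arm.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// record32 is a hypothetical struct whose ID field is 32 bits wide,
// standing in for a plain int field on a 32-bit architecture.
type record32 struct {
	ID int32 `json:"id"`
}

// record64 is a hypothetical struct with an explicit 64-bit ID field,
// so its width does not depend on the build architecture.
type record64 struct {
	ID int64 `json:"id"`
}

// decode32 unmarshals payload into the 32-bit struct.
func decode32(payload []byte) (int32, error) {
	var r record32
	err := json.Unmarshal(payload, &r)
	return r.ID, err
}

// decode64 unmarshals payload into the 64-bit struct.
func decode64(payload []byte) (int64, error) {
	var r record64
	err := json.Unmarshal(payload, &r)
	return r.ID, err
}

func main() {
	// Same record ID as in the error message above.
	payload := []byte(`{"id": 5025119774}`)

	fmt.Println("max int32:", math.MaxInt32)

	// The value exceeds the 32-bit range, so encoding/json reports an error.
	if _, err := decode32(payload); err != nil {
		fmt.Println("32-bit field:", err)
	}

	// With a 64-bit field the same payload decodes without error.
	id, err := decode64(payload)
	fmt.Println("64-bit field:", id, err)
}
```

Declaring the field as int64 rather than int is one way to make the decoding independent of the architecture; whether that matches the eventual fix in the library is an assumption.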

Yep, confirmed. I see you created a bug report.

getconf INT_MAX
2147483647

I just did another test: moved acme.json and fired up traefik v1. Now I get the same issue there. I guess the ID numbers have grown > INT_MAX over time - perhaps a DB auto increment ... just speculating ...