Problems provisioning a Let's Encrypt certificate with Route53 and Helm Chart

I am using AWS EKS with Traefik as the ingress controller.
I am using Helm to install Traefik.
I am trying to create a wildcard certificate using Route53 as the DNS challenge provider.
However, nothing happens, and the acme.json file remains empty.
Below are snippets of my configuration; I am deploying with Terraform.

Reference: traefik-helm-chart/traefik/values.yaml at master · traefik/traefik-helm-chart · GitHub

image:
  registry: docker.io
  repository: traefik
  pullPolicy: IfNotPresent

deployment:
  enabled: true
  kind: Deployment
  replicas: 1
  terminationGracePeriodSeconds: 60
  minReadySeconds: 10
  annotations: {}
  labels: {}
  podAnnotations: {}
  podLabels: {}
  additionalContainers: []
...
  initContainers:
    - name: volume-permissions
      image: busybox:latest
      command:
        [
          "sh",
          "-c",
          "touch /data/acme.json; chown 65532 /data/acme.json; chmod -v 600 /data/acme.json",
        ]
      securityContext:
        runAsNonRoot: false
        runAsGroup: 0
        runAsUser: 0
      volumeMounts:
        - name: data
          mountPath: /data
...
  websecure:
    port: 8443
    expose:
      default: true
    exposedPort: 443
    protocol: TCP
    forwardedHeaders:
      insecure: true
    http3:
      enabled: false
    tls:
      enabled: true
      domains:
        - main: "*.meu.domain.com"
          sans:
            - meu.domain.com

...
tlsOptions:
  mtls:
    minVersion: VersionTLS12
    clientAuth:
      clientAuthType: RequireAndVerifyClientCert
      caFiles:
        - /certs/ca.crt
  standard:
    minVersion: VersionTLS12
...
persistence:
  enabled: true
  name: data
  accessMode: ReadWriteOnce
  storageClass: gp2
  volumeName: traefik-data
  size: 128Mi
  path: /data
  annotations: {}

certResolvers:
  letsencrypt:
    email: meuemail@domain.com.br
    storage: /data/acme.json
    dnsChallenge:
      provider: route53
      delayBeforeCheck: 0

Terraform

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "20.23.0"

  cluster_name    = local.cluster_name
  cluster_version = var.cluster_version

  vpc_id     = lookup(data.terraform_remote_state.infra.outputs.eks_vpc, "id")
  subnet_ids = lookup(data.terraform_remote_state.infra.outputs.eks_vpc, "private_subnets_ids")


  cluster_addons_timeouts = {
    create = "30m"
    update = "30m"
    delete = "30m"
  }

  authentication_mode = "API_AND_CONFIG_MAP"

  enable_irsa = true # enables IAM Roles for Service Accounts (IRSA) on the EKS cluster

  cluster_endpoint_public_access  = true
  cluster_endpoint_private_access = true

  enable_cluster_creator_admin_permissions = true

  cluster_addons = {
    coredns = {
      most_recent = true
    }
    vpc-cni = {
      most_recent = true
    }

    aws-ebs-csi-driver = {
      most_recent              = true
      service_account_role_arn = "arn:aws:iam::${var.account_id}:role/AmazonEKS_EBS_CSI_DriverRole"
    }
  }

  iam_role_additional_policies = {
    AmazonEKSServicePolicy = "arn:aws:iam::aws:policy/AmazonEKSServicePolicy"
  }

  eks_managed_node_group_defaults = {
    disk_size = 30

    tags = merge({
      "k8s.io/cluster-autoscaler/${local.cluster_name}" = "owned"
      "k8s.io/cluster-autoscaler/enabled"               = "TRUE"
    }, var.tags)
  }

  eks_managed_node_groups = {
    ng1 = {
      name          = "${replace(var.instance_type, ".", "-")}-nodes"
      instance_type = var.instance_type

      min_size     = 1
      max_size     = var.max_size
      desired_size = var.desired_size

      iam_role_additional_policies = {
        AmazonEBSCSIDriverPolicy = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
      }

      capacity_type = "ON_DEMAND"

      tags = merge({
        "Name" = "ng1-${local.cluster_name}"
      }, var.tags)

    }
  }

  node_security_group_additional_rules = {
    egress_cluster_9443 = {
      description                   = "Node groups to cluster webhooks"
      protocol                      = "tcp"
      from_port                     = 9443
      to_port                       = 9443
      type                          = "egress"
      source_cluster_security_group = true
    }

    ingress_self_all = {
      description = "Node to node all ports/protocols"
      protocol    = "-1"
      from_port   = 0
      to_port     = 0
      type        = "ingress"
      self        = true
    }
  }

  tags = var.tags
}

kubectl exec -n traefik traefik-7f7f8f59bb-rdfls -- ls -l /data
Defaulted container "traefik" out of: traefik, volume-permissions (init)
total 0
-rw-------    1 65532    root             0 Aug 13 23:50 acme.json

The AmazonEKS_EBS_CSI_DriverRole also has Route53 permissions, but nothing appears in the Traefik log, not even an error, and the acme.json file is still zero bytes. Has anyone managed to get this configuration to work?
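
One thing that may explain the empty acme.json: in Traefik, defining a certificate resolver does not by itself trigger any ACME request. A certificate is only requested once an entrypoint or router references the resolver. The values shown above define certResolvers.letsencrypt, but the websecure TLS block does not name it. A minimal sketch of what that could look like, assuming the websecure block sits under ports in the chart values; the role ARN is a placeholder for a hypothetical dedicated role, not something from the original post:

```yaml
# Hypothetical values.yaml fragment (a sketch, not a verified fix for this cluster).
ports:
  websecure:
    tls:
      enabled: true
      # Must match the key defined under certResolvers.
      certResolver: letsencrypt
      domains:
        - main: "*.meu.domain.com"
          sans:
            - meu.domain.com

# Assumption: a dedicated IAM role exists with Route53 permissions and an
# IRSA trust policy for Traefik's service account. The ARN is a placeholder.
serviceAccountAnnotations:
  eks.amazonaws.com/role-arn: "arn:aws:iam::<ACCOUNT_ID>:role/<TRAEFIK_ROUTE53_ROLE>"
```

The annotation matters because an IRSA role can only be assumed by the service accounts named in its trust policy. If the trust policy of AmazonEKS_EBS_CSI_DriverRole only names the EBS CSI controller service account, which is the usual setup, the Traefik pod would not receive that role's Route53 permissions.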

cert-manager is the usual choice for TLS on Kubernetes; see the Traefik blog article (link).

I read the documentation and made progress on my Terraform setup for cert-manager. However, I'm having trouble generating the certificate. Below are parts of the cert-manager configuration and the error, which I'm still trying to figure out.

# Create the namespace for cert-manager
resource "kubernetes_namespace" "cert_manager" {
  metadata {
    name = "cert-manager"
  }
}

resource "helm_release" "cert_manager" {
  chart            = "cert-manager"
  name             = "cert-manager"
  create_namespace = false
  namespace        = kubernetes_namespace.cert_manager.metadata[0].name
  repository       = "https://charts.jetstack.io"
  version          = "v1.15.2"
  force_update     = true
  wait             = true

  set {
    name  = "crds.enabled"
    value = "true"
  }

  set {
    name  = "serviceAccount.name"
    value = kubernetes_service_account_v1.cert_manager.metadata[0].name
  }

  set {
    name  = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
    value = aws_iam_role.cert_manager_role.arn
  }

  set {
    name  = "serviceAccount.create"
    value = "false"
  }

  set {
    name  = "securityContext.enabled"
    value = true
  }

  set {
    name  = "securityContext.fsGroup"
    value = 1001
  }
}

resource "kubernetes_service_account_v1" "cert_manager" {
  metadata {
    name      = "cert-manager"
    namespace = kubernetes_namespace.cert_manager.metadata[0].name
    annotations = {
      "eks.amazonaws.com/role-arn" = aws_iam_role.cert_manager_role.arn
    }
  }
}

data "aws_iam_policy_document" "cert_manager_trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [var.oidc_provider_arn]
    }

    condition {
      test     = "StringEquals"
      variable = "${var.oidc_provider}:aud"
      values   = ["sts.amazonaws.com"]
    }

    condition {
      test     = "StringEquals"
      variable = "${var.oidc_provider}:sub"
      values   = ["system:serviceaccount:cert-manager:cert-manager"]
    }
  }
}

resource "aws_iam_role" "cert_manager_role" {
  name        = "cert-manager-role"
  description = "eks cert-manager"

  assume_role_policy = data.aws_iam_policy_document.cert_manager_trust.json
}

data "aws_iam_policy_document" "cert_manager_policy" {
  statement {
    effect = "Allow"
    actions = [
      "route53:GetChange",
    ]
    resources = ["arn:aws:route53:::change/*"]
  }

  statement {
    effect = "Allow"
    actions = [
      "route53:ChangeResourceRecordSets",
      "route53:ListResourceRecordSets"
    ]
    resources = [
      "arn:aws:route53:::hostedzone/*"
    ]
  }

  statement {
    effect = "Allow"
    actions = [
      "route53:ListHostedZonesByName"
    ]
    resources = [
      "*",
    ]
  }
}

resource "aws_iam_role_policy" "cert_manager" {
  role   = aws_iam_role.cert_manager_role.name
  name   = "cert-manager"
  policy = data.aws_iam_policy_document.cert_manager_policy.json
}

resource "kubernetes_role_v1" "tokenrequest_role" {
  metadata {
    name      = "${kubernetes_service_account_v1.cert_manager.metadata[0].name}-tokenrequest"
    namespace = kubernetes_namespace.cert_manager.metadata[0].name
  }

  rule {
    api_groups     = [""]
    resources      = ["serviceaccounts/token"]
    resource_names = [kubernetes_service_account_v1.cert_manager.metadata[0].name]
    verbs          = ["create"]
  }
}

resource "kubernetes_role_binding_v1" "tokenrequest_rolebinding" {
  metadata {
    name      = "cert-manager-${kubernetes_service_account_v1.cert_manager.metadata[0].name}-tokenrequest"
    namespace = kubernetes_namespace.cert_manager.metadata[0].name
  }

  subject {
    kind      = "ServiceAccount"
    name      = "cert-manager"
    namespace = kubernetes_namespace.cert_manager.metadata[0].name
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "Role"
    name      = kubernetes_role_v1.tokenrequest_role.metadata[0].name
  }
}

ClusterIssuer.yaml

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns
spec:
  acme:
    email: your@main.com
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-dns
    solvers:
      - dns01:
          route53:
            region: sa-east-1
            hostedZoneID: XXXXXXXXXXXX
            role: arn:aws:iam::00000000000:role/cert-manager-role

Log

I0816 00:43:32.143920 1 controller.go:435] "serving insecurely as tls certificate data not provided" logger="cert-manager.controller"
I0816 00:43:32.143933 1 controller.go:102] "listening for insecure connections" logger="cert-manager.controller" address="0.0.0.0:9402"
I0816 00:43:32.144233 1 controller.go:178] "starting leader election" logger="cert-manager.controller"
I0816 00:43:32.144276 1 controller.go:127] "starting metrics server" logger="cert-manager.controller" address="[::]:9402"
I0816 00:43:32.144304 1 controller.go:171] "starting healthz server" logger="cert-manager.controller" address="[::]:9403"
I0816 00:43:32.146222 1 leaderelection.go:250] attempting to acquire leader lease kube-system/cert-manager-controller...
I0816 00:43:58.475886 1 leaderelection.go:260] successfully acquired lease kube-system/cert-manager-controller
I0816 00:43:58.478920 1 controller.go:248] "starting controller" logger="cert-manager.controller" controller="certificates-revision-manager"
I0816 00:43:58.480915 1 controller.go:248] "starting controller" logger="cert-manager.controller" controller="certificaterequests-issuer-venafi"
I0816 00:43:58.482805 1 controller.go:248] "starting controller" logger="cert-manager.controller" controller="clusterissuers"
I0816 00:43:58.485375 1 controller.go:248] "starting controller" logger="cert-manager.controller" controller="issuers"
I0816 00:43:58.487276 1 controller.go:248] "starting controller" logger="cert-manager.controller" controller="challenges"
I0816 00:43:58.488881 1 controller.go:248] "starting controller" logger="cert-manager.controller" controller="certificaterequests-approver"
I0816 00:43:58.490457 1 controller.go:248] "starting controller" logger="cert-manager.controller" controller="certificates-issuing"
I0816 00:43:58.492238 1 controller.go:248] "starting controller" logger="cert-manager.controller" controller="certificates-metrics"

Certificate.yaml

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-domain-tls
  namespace: default
spec:
  secretName: wildcard-domain-tls
  dnsNames:
    - "*.jumeci.com"
  issuerRef:
    name: letsencrypt-dns
    kind: ClusterIssuer

Error

E0816 01:42:22.745709       1 controller.go:162] "re-queuing item due to error processing" err="error instantiating route53 challenge solver: unable to assume role: operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: 28563191-37a5-4b88-8af6-8b34a7879ca6, api error AccessDenied: User: arn:aws:sts::00000000000:assumed-role/cert-manager-role/1723772542230200959 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::000000000000:role/cert-manager-role" logger="cert-manager.controller" key="default/wildcard-domain-tls-1-3859137268-3307597572"
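
Reading the error: the caller is already arn:aws:sts::...:assumed-role/cert-manager-role/..., which means the pod obtained the role through IRSA, and it is then trying to call sts:AssumeRole on that same role. The role field in the route53 solver tells cert-manager to perform this extra AssumeRole call, and the trust policy shown above only allows sts:AssumeRoleWithWebIdentity. A minimal sketch of the ClusterIssuer without that field, assuming the IRSA credentials on the cert-manager service account are sufficient (a ClusterIssuer uses ambient credentials by default):

```yaml
# Sketch only: the same ClusterIssuer as above with the `role` field removed,
# so cert-manager uses the credentials it already received through IRSA.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns
spec:
  acme:
    email: your@main.com
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-dns
    solvers:
      - dns01:
          route53:
            region: sa-east-1
            hostedZoneID: XXXXXXXXXXXX
            # No `role:` here. The pod already runs as cert-manager-role.
```

The alternative would be to keep role and extend the role's trust policy so that the role may assume itself, but dropping the field is the smaller change.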
