Streamlining Certificate Management with Let’s Encrypt and Ansible

In this post I’ll show you, with examples, how we’re using Ansible and Route 53 to request and renew Let’s Encrypt certificates that secure our internal tools and APIs at Just Eat. I’ll also describe how we’re automating this with Concourse CI, how we’re storing the certificates securely in AWS Certificate Manager (ACM), and how we’re doing this across our dozens of AWS accounts.

In the beginning…

Our requirement to use Let’s Encrypt arose from a project to implement HashiCorp Vault. We wanted to secure our Vault instances with Transport Layer Security (TLS) certificates so that we could have end-to-end encryption terminated at the Vault EC2 instances. At the time, the only certificates available to us were self-signed certificates from an internally managed Certificate Authority (CA). These would have required us to ensure that every Vault user had a valid root CA certificate installed to avoid warnings in the browser.

Securing our certificates

Secure storage of certificates is critical. We’re utilising AWS Secrets Manager to store the certificates, with an appropriate IAM resource policy to ensure that they can only be retrieved by authenticated and authorised entities. We can also see from AWS CloudTrail when these certificates are retrieved and by whom.

It is important to point out that, although it is not shown in the examples below, we’re wrapping our Ansible tasks in blocks that set no_log. This prevents certificate material from being inadvertently leaked into logs.
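As a minimal sketch of that pattern, the block below reads a private key from disk and stores it in Secrets Manager, with no_log applied to every task inside the block. The use of the community.aws.secretsmanager_secret module and the secret naming scheme are assumptions made for illustration; the variables are the ones used in the examples later in this post.

```yaml
# Sketch only: assumes the community.aws collection is installed.
- name: Store the private key without logging its contents
  no_log: true   # inherited by every task in the block
  block:
    - name: Read the private key from disk
      become: yes
      slurp:
        src: "{{ private_key_path }}"
      register: private_key_file

    - name: Store the private key in AWS Secrets Manager
      community.aws.secretsmanager_secret:
        name: "letsencrypt/{{ dns }}/private-key"   # hypothetical naming scheme
        state: present
        secret_type: string
        secret: "{{ private_key_file.content | b64decode }}"   # slurp returns base64
        region: "{{ region }}"
```

With no_log set at block level, a failure inside the block is reported with its output censored, so it can be useful to disable it temporarily while debugging against the staging environment.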

Orchestration

We’re running all of these tasks in an Ansible role from Concourse CI. Concourse CI runs our Ansible role in an ephemeral container that is destroyed once we have created and stored the certificate. We run the role on a schedule, and each run generates a new certificate for any that is due to expire within the next 10 days.

We use a simple Concourse CI pipeline to run our Ansible role in each of our AWS accounts and ensure we have consistent certificates everywhere they’re needed. The pipeline has a Git-based trigger for any changes made to the master branch of the Ansible role, and a time-based trigger that runs the pipeline in all accounts on a fortnightly schedule.
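A pipeline with those two triggers could look roughly like the sketch below. The repository URI, container image, and playbook path are hypothetical placeholders, and the sketch shows a single job rather than one per AWS account.

```yaml
# Sketch only: URI, image, and playbook path are hypothetical.
resources:
  - name: letsencrypt-role
    type: git
    source:
      uri: https://git.example.com/infra/ansible-letsencrypt.git
      branch: master

  - name: fortnightly
    type: time
    source:
      interval: 336h   # 14 days

jobs:
  - name: renew-certificates
    plan:
      - get: letsencrypt-role
        trigger: true   # run on every change to master
      - get: fortnightly
        trigger: true   # run on the time-based schedule
      - task: run-ansible-role
        config:
          platform: linux
          image_resource:
            type: registry-image
            source:
              repository: example/ansible   # hypothetical image with Ansible and the AWS CLI
          inputs:
            - name: letsencrypt-role
          run:
            path: ansible-playbook
            args:
              - letsencrypt-role/playbook.yml   # hypothetical playbook path
```

To cover several accounts, one option is to repeat the job per account and pass the account-specific credentials or role to the task as params.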

Building out the solution

At Just Eat we use Ansible for our infrastructure provisioning as it gives us a way to simply declare the resources we would like and create them in an idempotent way. Fortunately there are a number of open-source Ansible modules available that make the use of Let’s Encrypt fairly straightforward.
The step-by-step process we use to generate a Let’s Encrypt certificate is laid out below.

Generate the ACME account private key

The Automatic Certificate Management Environment (ACME) account private key identifies our account to Let’s Encrypt and is used later by the acme_certificate module.

- name: Generate the acme account key
  become: yes
  openssl_privatekey:
    path: "{{ account_key_path }}"
    owner: letsencrypt
    group: letsencrypt
    mode: "0640"

Generate the certificate private key

We need a private key that will be used for the certificate we are generating.

- name: Generate the private key
  become: yes
  openssl_privatekey:
    path: "{{ private_key_path }}"
    owner: letsencrypt
    group: letsencrypt
    mode: "0640"
    size: 2048

Note: We’re using a 2048-bit key because AWS Load Balancers do not support larger keys if the certificate is stored in ACM. See here for more details: https-listener-certificates

Generate the Certificate Signing Request

The Certificate Signing Request (CSR) will be used when requesting the certificate from Let’s Encrypt.

- name: Generate the certificate signing request
  become: yes
  openssl_csr:
    path: "{{ csr_path }}"
    privatekey_path: "{{ private_key_path }}"
    common_name: "{{ dns }}"
    owner: letsencrypt
    group: letsencrypt
    mode: "0640"

Create a DNS challenge

This is one of the key tasks in the process. It requests the certificate from Let’s Encrypt and returns information about the DNS records we need to create to prove that we own the domain and allow the request to be validated by Let’s Encrypt.

Let’s Encrypt has a staging environment (https://letsencrypt.org/docs/staging-environment/) that allows you to get the settings right before trying to issue production certificates.

- name: Create a DNS challenge
  become: yes
  acme_certificate:
    account_key_src: "{{ account_key_path }}"
    account_email: "{{ email }}"
    csr: "{{ csr_path }}"
    dest: "{{ cert_path }}"
    challenge: dns-01
    acme_directory: "{{ acme_directory_url }}"
    acme_version: 2
    terms_agreed: yes
    remaining_days: 10
    force: "{{ force }}"
  register: dns_challenge

Note: We use an Ansible variable to control whether or not to force the DNS challenge. This should be used with care, as Let’s Encrypt will rate limit (https://letsencrypt.org/docs/rate-limits/) certificate requests if you request too many certificates with the same parameters in a given time period.
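The acme_directory_url and force variables used in the task above could be given safe defaults along these lines. The file location is an assumption; the two directory URLs are the published Let’s Encrypt ACME v2 endpoints.

```yaml
# defaults/main.yml (hypothetical location within the role)
# Let's Encrypt ACME v2 directory endpoints:
#   staging:    https://acme-staging-v02.api.letsencrypt.org/directory
#   production: https://acme-v02.api.letsencrypt.org/directory
acme_directory_url: https://acme-staging-v02.api.letsencrypt.org/directory

# Only set to true for a one-off reissue: forced requests count
# towards the Let's Encrypt rate limits.
force: false
```

Defaulting to staging means a misconfigured run cannot consume production rate limits; the production URL can then be supplied explicitly, for example with -e acme_directory_url=... on the ansible-playbook command line.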

Set some facts…

We register the result of the acme_certificate task as dns_challenge and extract from it the DNS record name and value that we need to create in Route 53 to have the certificate request validated.

- name: Set dns challenge facts
  when: dns_challenge is changed
  block:
    - name: Set dns_challenge_record fact
      set_fact:
        dns_challenge_record: "{{ dns_challenge.challenge_data[dns]['dns-01'].record }}"

    - name: Set dns_challenge_value fact
      set_fact:
        dns_challenge_value: "{{ dns_challenge.challenge_data[dns]['dns-01'].resource_value }}"

Create the Route 53 record

From here we can create the Route 53 record, then wait for the change to propagate.

- name: Get hosted zone id
  command: |
    aws route53 list-hosted-zones-by-name \
    --dns-name "{{ hosted_zone }}" \
    --query "HostedZones[0].Id" \
    --output text \
    --region {{ region }}
  register: hosted_zones_cmd
  changed_when: false  # read-only lookup

- name: Create dns-challenge.json file
  template:
    src: dns-challenge.json.j2
    dest: /tmp/dns-challenge.json

- name: Set record
  command: |
    aws route53 change-resource-record-sets \
       --hosted-zone-id "{{ hosted_zones_cmd.stdout }}" \
       --change-batch file:///tmp/dns-challenge.json \
       --region {{ region }}
  register: record_set_cmd

- name: Wait for record change
  command: |
    aws route53 wait resource-record-sets-changed \
      --id "{{ (record_set_cmd.stdout|from_json).ChangeInfo.Id }}"
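The dns-challenge.json.j2 template itself is not shown above. As a rough sketch of what the rendered change batch needs to contain, the task below builds an equivalent file from a YAML dictionary instead of a template. The comment text and the TTL of 60 seconds are assumptions; the structure follows the Route 53 change batch format.

```yaml
# Sketch only: stands in for the dns-challenge.json.j2 template task.
- name: Create dns-challenge.json file
  when: dns_challenge is changed   # the facts only exist when a challenge was issued
  copy:
    dest: /tmp/dns-challenge.json
    content: "{{ change_batch | to_nice_json }}"
  vars:
    change_batch:
      Comment: Let's Encrypt dns-01 challenge
      Changes:
        - Action: UPSERT   # create the record, or overwrite a stale one
          ResourceRecordSet:
            Name: "{{ dns_challenge_record }}"
            Type: TXT
            TTL: 60
            ResourceRecords:
              # Route 53 requires TXT values to be wrapped in double quotes
              - Value: '"{{ dns_challenge_value }}"'
```

Using UPSERT rather than CREATE keeps the task repeatable, since a challenge record left over from an earlier run is simply replaced.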

Validate and retrieve the certificate

Finally we can validate the certificate challenge and retrieve the certificate and intermediate certificate.

- name: Let the challenge be validated and retrieve the cert and intermediate certificate
  become: yes
  when: dns_challenge is changed
  acme_certificate:
    account_key_src: "{{ account_key_path }}"
    account_email: "{{ email }}"
    csr: "{{ csr_path }}"
    dest: "{{ cert_path }}"
    fullchain_dest: "{{ fullchain_path }}"
    chain_dest: "{{ chain_path }}"
    challenge: dns-01
    acme_directory: "{{ acme_directory_url }}"
    acme_version: 2
    terms_agreed: yes
    remaining_days: 10
    force: "{{ force }}"
    data: "{{ dns_challenge }}"

Renewal

Let’s Encrypt certificates are valid for 90 days, but because of the way the Ansible role is written, renewal is trivial. In the snippets above, remaining_days is set to 10, so the acme_certificate module only requests a new certificate when the existing one has fewer than 10 days of validity left. Running the role on a schedule therefore renews certificates automatically as they approach expiry.
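To see ahead of a scheduled run whether a certificate falls inside that window, a check along the following lines can help. This is a sketch that assumes the community.crypto collection is installed and that the certificate already exists at cert_path; the renewal_threshold label is an arbitrary name.

```yaml
# Sketch only: fails if no certificate exists yet at cert_path.
- name: Check whether the certificate is still valid 10 days from now
  become: yes
  community.crypto.x509_certificate_info:
    path: "{{ cert_path }}"
    valid_at:
      renewal_threshold: "+10d"   # hypothetical label for the time point
  register: cert_info

- name: Report whether a renewal is due
  debug:
    msg: "Renewal due: {{ not cert_info.valid_at.renewal_threshold }}"
```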

Next Steps

Our initial goal was to create Let’s Encrypt certificates for HashiCorp Vault, but it quickly became clear that the Ansible role we had created was generic enough to reuse elsewhere. This allows us to extend our Let’s Encrypt usage to AWS Certificate Manager (ACM), so that the certificates can be used in CloudFront, API Gateway, and AWS-managed load balancers.
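One possible shape for that ACM import step is sketched below. It assumes the community.aws collection is installed, that the role runs on the same host that holds the certificate files, and that the user running Ansible can read them; the name tag is an arbitrary choice.

```yaml
# Sketch only: not the implementation described in this post.
- name: Import the certificate into ACM
  no_log: true   # keep the private key out of the task output
  community.aws.acm_certificate:
    certificate: "{{ lookup('file', cert_path) }}"
    private_key: "{{ lookup('file', private_key_path) }}"
    certificate_chain: "{{ lookup('file', chain_path) }}"
    name_tag: "{{ dns }}"   # hypothetical tag used to find the certificate on later runs
    region: "{{ region }}"
    state: present
```

Certificates used by CloudFront must be imported into the us-east-1 region, so the region would need to be overridden for that use case.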

Conclusion

Let’s Encrypt provided a great solution to a number of problems we were facing at Just Eat. We are now able to automate our certificate management and run it on a regular schedule, which reduces the risk and effort associated with expiring certificates. We have also vastly simplified our certificate infrastructure, making it much easier to configure our applications with TLS endpoints.