Starting an IaC Repository with GitHub and Terraform
There are only two hard things in Computer Science: cache invalidation, naming things, and off by one errors.
- Loosely attributed to Phil Karlton
Let's start off with a bit of a hot take - Terraform isn't particularly hard to learn. It does use its own configuration language, but most people don't struggle with learning the syntax.
Infrastructure-as-Code (IaC) isn't about the programming language - it's about establishing a body of discipline around managing infrastructure. Tools like Ansible and Terraform simply facilitate the practice.
Instead of focusing on programmatically elegant tricks here, let's focus on building a "starter kit" of sorts to grow this practice from. The managed resources in this example will be intentionally simple to shift focus to the structure, naming, and release management aspects of Infrastructure-as-Code.
Repositories (Structure and Naming)
Start a GitHub repository with some basic documentation before contributing code:
- `README.md` should describe what the project is for and describe the project structure: how the software works.
- `USAGE.md` should describe how to consume resources within the project and how release management works.
- `CONTRIBUTING.md` should describe how to contribute to the codebase: the branch and merge workflows and rules of conduct go here.
- `CHANGELOG.md` should be created based on the Keep a Changelog standards.
- `.gitignore` should make sure that any temporary files created by tools, like `__pycache__` and Terraform locks, don't accidentally get committed to the repository.
- `markdownlint.json` and any other linting rules - automated code QC is a good thing.
- `img/` should be created to contain rendered images for documentation. Use illustrations to make the repository easy to understand!
- `dwg/` should be created to contain unrendered diagrams, e.g. `svg`, `d2`.
- `doc/` may be created for any automatically rendered documentation, e.g. ReadTheDocs.
Once these are created, start mapping out what loose structures should be included in the repository. Here are some examples:
- `conf.d/` for any flat file configurations that may get deployed
  - Make subdirectories for any machine targets
- `roles/` for any Ansible roles. Since this is IaC, breaking this down into roles instead of one giant pile will be simpler
  - Within each role: `templates/` should contain any Jinja2 templates. Ansible will auto-detect this folder by name, and it simplifies structure quite a bit.
  - `requirements.txt` should contain any software prerequisites for the Ansible playbooks. This facilitates CI/CD tooling with virtual environments, in addition to better documenting software dependencies.
  - Playbooks and truth files, of course
- `terraform/` for any Terraform code
  - `modules/` for any Terraform re-usable modules
  - `accounts/` for any Terraform tenants, e.g. AWS Accounts, CloudFlare accounts, or other unrelated resources to keep them separate and organized
- `python/` for any Python code
- `js/` for any JavaScript
- ...and so on.
Now that the raw structure is somewhat laid out, we can shift focus to the structure of each Terraform account's subdirectory (in /terraform/accounts/{{ account_type }}_{{ account_id }}_{{ account_name }}). Here's what I've seen lead to a maintainable code base:
- /terraform/accounts/cloudflare_12345_engyak_co
  - `templates/` for any `gotmpl` templates
  - `provider.tf` should declare any Terraform pre-requisites, e.g. the Cloudflare provider minimum version
  - `vars.tf` should declare any input variables. In my experience, this is a good place for module inputs, but not as useful for actual infrastructure declarations
  - `locals.tf` should declare any Don't Repeat Yourself (DRY) variables. I typically use them for consistent resource names and IDs. There are a lot of opinions about `vars` versus `locals`, but there are a few key differences:
    - `vars` should actually be variable (non-static multiples of a `resource`)
    - `locals` can render and iterate on an input, e.g. with `for_each` loops
  - `backend.tf` should indicate where `terraform.tfstate` is placed and any file locking. Normally, this points to an S3 bucket and provides authorization for it
  - `data.tf` should have any external data resources. This example doesn't need any, but AWS IAM policy documents and S3 bucket policies fit this category. Any resource prefixed with `data` instead of `resource` goes here, essentially
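To make the split between these files a little more concrete, here's a minimal sketch of what `vars.tf`, `locals.tf`, and `backend.tf` might contain for this account. Every name here is an assumption for illustration (the `zone_id` variable, the local names, the bucket, key, and region), and `use_lockfile` assumes Terraform 1.10 or newer with the S3 backend. A pipeline using this backend would also need AWS credentials for the bucket.

```hcl
# vars.tf: inputs that genuinely vary between deployments
# (hypothetical variable, not part of the original example)
variable "zone_id" {
  description = "Cloudflare zone ID for the managed DNS zone"
  type        = string
}

# locals.tf: values computed once and reused for consistent naming
# (hypothetical names)
locals {
  zone_name    = "engyak.co"
  pages_target = "blog-engyak-co.pages.dev"

  # Resource-name-safe version of the zone name
  name_prefix = replace(local.zone_name, ".", "_")
}

# backend.tf: where terraform.tfstate lives and how it is locked
terraform {
  backend "s3" {
    bucket       = "example-terraform-state"                # hypothetical bucket
    key          = "cloudflare/engyak_co/terraform.tfstate" # hypothetical key
    region       = "us-east-1"                              # hypothetical region
    encrypt      = true
    use_lockfile = true # S3-native state locking, assumes Terraform >= 1.10
  }
}
```

Resources can then reference `var.zone_id` and `local.pages_target` instead of repeating literal values in every file.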
Now that all that's out of the way, we're able to actually create resources. Things can be a lot more free-form here, because the definition of related resources can vary greatly based on who's doing the work.
My personal preference is to maintain small, easily readable files that function independently wherever possible. In this example, we'll use one file for each DNS zone. Here's /terraform/accounts/cloudflare_youwish_engyak_co/engyak.co.tf:
    resource "cloudflare_record" "engyak_co_blog" {
      content = "blog-engyak-co.pages.dev"
      name    = "blog"
      proxied = false
      ttl     = 1
      type    = "CNAME"
      zone_id = "redacted"
    }

    resource "cloudflare_record" "engyak_co_root" {
      content = "blog-engyak-co.pages.dev"
      name    = "engyak.co"
      proxied = true
      ttl     = 1
      type    = "CNAME"
      zone_id = "redacted"
    }

    resource "cloudflare_record" "engyak_co_uri_blog" {
      name     = "engyak.co"
      priority = 1
      proxied  = false
      ttl      = 1
      type     = "URI"
      zone_id  = "redacted"
      data {
        target = "blog.engyak.co"
        weight = 1
      }
    }
These resources are built according to the provider in provider.tf:
    terraform {
      required_providers {
        cloudflare = {
          source  = "cloudflare/cloudflare"
          version = "~> 4"
        }
      }
    }

    provider "cloudflare" {
    }
Always consult the provider's documentation on how to use their resources.
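If the number of near-identical records grows, the `locals` plus `for_each` pattern mentioned earlier is one way to keep things DRY. This is only a sketch under assumptions: the `cname_records` map and the `engyak_co_cname` resource name are hypothetical, and it trades some of the one-resource-per-block readability for less repetition.

```hcl
locals {
  # Hypothetical map: record name => whether Cloudflare should proxy it
  cname_records = {
    "blog"      = false
    "engyak.co" = true
  }
}

# One resource block that expands to one CNAME record per map entry
resource "cloudflare_record" "engyak_co_cname" {
  for_each = local.cname_records

  content = "blog-engyak-co.pages.dev"
  name    = each.key
  proxied = each.value
  ttl     = 1
  type    = "CNAME"
  zone_id = "redacted"
}
```

Terraform tracks these under new addresses (for example `cloudflare_record.engyak_co_cname["blog"]`), so converting records that already exist would need a `moved` block or a state move to avoid a destroy-and-recreate.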
Actions (Release Management)
The biggest advantage a Git repository has for Infrastructure-as-Code is its versioning capability, but the ability to control the release of changes can really take things to the next level.
First, I'd recommend starting out with a branch management plan. It can start simple, like:
- Don't allow any commits directly to `main` (GitHub branch protection rules, plus general guidance in `CONTRIBUTING.md`)
- Only allow code to be pushed to `main` via a successful pull request (GitHub branch protection rules do this as well)
  - At least 1 approving peer review
  - All testing must PASS (more on this later)
- All prospective changes must start as a diverging branch (or fork, but forking is much more advanced) that is up-to-date with `main`
- Outline appropriate change windows, if applicable
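Since this is an IaC repository anyway, these branch rules can themselves be managed as code. Here's a rough sketch using the `github_branch_protection` resource from the `integrations/github` provider. The repository name is hypothetical, the provider version pin is an assumption, and the status check name assumes a CI job called `Terraform Plan`. Clicking through the GitHub settings UI achieves the same result.

```hcl
terraform {
  required_providers {
    github = {
      source  = "integrations/github"
      version = "~> 6.0" # assumed version pin
    }
  }
}

resource "github_branch_protection" "main" {
  repository_id  = "iac-starter-kit" # hypothetical repository name
  pattern        = "main"
  enforce_admins = true # apply the rules to administrators too

  # Changes reach main only through a reviewed pull request
  required_pull_request_reviews {
    required_approving_review_count = 1
  }

  # All testing must pass, and the branch must be up-to-date with main
  required_status_checks {
    strict   = true
    contexts = ["Terraform Plan"] # assumed CI job name
  }
}
```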
At this point, the rules are in place, but none of it actually controls release. GitHub doesn't have credentials to release changes; ideally no users should either. The objective here is to prevent all direct changes to infrastructure. This can be achieved with AWS IAM roles, Cloudflare RBAC, or an equivalent. Take away the keys!
GitHub Actions provides a (usually free or cheap) amnesic container service to run ephemeral code from source control. This is going to be the foundation for this example moving forward, but other providers like GitLab and Atlassian have equivalents as well. If the source control provider doesn't have a built-in service, plenty of other CI tools exist to fill that gap, like Jenkins and Concourse.
For a Terraform pipeline, there should be two Actions per account:
- `terraform plan`: This will test your code for validity, and also explain any potential impacts the change might have
- `terraform apply`: This will implement tested changes. This Action should be restricted to the `main` branch!
Here's an example plan Action. I named it based on `{{ event trigger }}: {{ provider }} {{ action }}` to keep things organized.
    ---
    name: 'On-Commit: Cloudflare Terraform Plan'

    on:
      push:

    permissions:
      contents: read

    jobs:
      plan:
        name: 'Terraform Plan'
        env:
          CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }}
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - name: 'Terraform Setup'
            uses: hashicorp/setup-terraform@v3
            with:
              terraform_version: '>= 1.10.5'
          - name: 'Terraform Plan'
            run: |
              terraform init
              terraform validate
              terraform plan -input=false
            working-directory: terraform/accounts/cloudflare_youwish_engyak_co/
Here's a rundown on how the testing works:
- We use the `env` directive to expose `CLOUDFLARE_API_TOKEN` (specified in the `cloudflare` provider as the way to pass credentials)
- We use `actions/checkout@v4` (or latest version) to load a copy of the pushed commit into the Actions runner
- We use `hashicorp/setup-terraform@v3`. Previous Actions runners shipped with Terraform, but the base image didn't update this package frequently enough. Now it doesn't ship with the image - but this tool lets us restrict and control software versions as part of the pipeline. This lets us slow releases if breaking changes occur with `terraform` without having to monkey around with internals - it's a much better system.
- The `Terraform Plan` step is where most of the work gets done. We initialize Terraform, validate the configuration, and then run `terraform plan` in non-interactive mode (`-input=false`), all inside our workspace via the `working-directory` key.
This will now run every time code is pushed to the repository, and it'll display any expected changes for that commit. If it fails, it will produce an error and (ideally) notify engineers/developers on where to fix it.
Note: `terraform validate` and `terraform plan` do not catch all problems; they only test the configuration for validity. Resource conflicts and API idiosyncrasies will pass this step and only reveal themselves on apply!
Now, we can finally start releasing changes:
    ---
    name: 'Cron-Demand: Cloudflare Terraform Apply'

    on:
      # Manual runs. workflow_dispatch has no branch filter,
      # so the `if:` on the apply step guards the branch instead.
      workflow_dispatch:
      # Scheduled runs always use the repository's default branch
      schedule:
        - cron: "15 4,5 * * *"

    permissions:
      contents: read

    jobs:
      plan:
        name: 'Terraform Plan'
        env:
          CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }}
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - name: 'Terraform Setup'
            uses: hashicorp/setup-terraform@v3
            with:
              terraform_version: '>= 1.10.5'
          - name: 'Terraform Plan'
            id: tf_plan
            run: |
              terraform init
              terraform validate
              terraform plan -input=false -detailed-exitcode
            continue-on-error: true
            working-directory: terraform/accounts/cloudflare_youwish_engyak_co/
          - name: 'Terraform Apply'
            # Only apply from main, and only when the plan reported pending changes (exit code 2)
            if: github.ref == 'refs/heads/main' && steps.tf_plan.outputs.exitcode == '2'
            run: |
              terraform apply -input=false -auto-approve
            working-directory: terraform/accounts/cloudflare_youwish_engyak_co/
This Action will run daily at 04:15 and 05:15 UTC, or whenever it is executed manually. We've established a "change window", and there are a few more complexities added to this workflow to implement change safety:
- `-detailed-exitcode` and `id: tf_plan` allow us to "catch" the results of `terraform plan`. A return code of `0` means no changes are required, `1` means the plan failed, and `2` means changes are required.
- `if:` conditionals restrict the dangerous parts of the workflow to only execute when the branch is `main` and `plan` is valid and expects changes.
Terraform Starter Kit
This template should act as a foundational "starter kit" for establishing an effective, robust, mature Infrastructure-as-Code practice. I've found that it's easier to modify and improve an existing process than to start anew - the objective here is to get engineers past that "writer's block."
Happy coding!