
Static Analysis of Terraform code with Checkov

by Pedro Santos

April 23, 2022


In the previous post about Terraform, I made a case for testing your Terraform code with Go and Terratest. In this post, I’ll make a case for static analysis tools. Static analysis tools for Terraform are a powerful mechanism to help your team follow industry best practices. In addition, your organization’s infrastructure team can leverage static analysis tools and custom checks to document and enforce company-wide policies.

These tools operate on the Terraform code or on the Terraform plan, without deploying anything. Hence, they are faster to run than an end-to-end test in Terratest. Instead of working on the Terraform infrastructure as a whole, static analysis tools focus on each resource individually.

Checkov

The software I recommend for static analysis of Terraform is Checkov. Checkov has the largest number of built-in policies, covering the most common pitfalls. It also offers an easy way to create custom rules and integrates very well with most CI/CD services.

Nowadays, Checkov supports two methods for implementing custom policies:

  1. Python code
  2. YAML definition file

YAML tends to be much easier to start with, but for any non-trivial application, the equivalent Python code will be cleaner. If your team has experience with Python, implementing policies with it will be a gratifying experience. If your team does not have experience with Python, the cost of learning it will not outweigh the disadvantages of the YAML method.

Example

The following is an example deployment with some checks. Like the Terratest blog post, this example neither represents a real workload nor follows best practices. We’ll be using the Azure cloud for this example and custom checks implemented with Python.

Say you’d like to deploy a storage account in your organization’s subscription. According to your organization’s naming scheme, one must prefix every resource group’s name with the environment targeted (either “dev” or “prod”).

We’ll organize the code with the following structure:

example_storage_account
├── policies
│   ├── custom_policies.py
│   └── __init__.py
└── src
    └── main.tf

Our initial iteration of src/main.tf is as follows:

provider "azurerm" {
  version = "=2.43.0"
  features {}
}

variable "environment" {
  type = string
}

resource "azurerm_resource_group" "this" {
  name     = "${var.environment}-myrg"
  location = "West Europe"
}

resource "azurerm_storage_account" "this" {
  name                     = "mystorageaccount"
  account_replication_type = "LRS"
  account_tier             = "Premium"
  location                 = azurerm_resource_group.this.location
  resource_group_name      = azurerm_resource_group.this.name
}

Now let’s run the Terraform code through Checkov. Install the tool with:

$ pip install checkov

Now, in the directory example_storage_account, run Checkov with:

$ checkov --directory src

Surprisingly, Checkov failed our deployment! Here’s the abridged output:

Check: CKV_AZURE_33: "Ensure Storage logging is enabled for Queue service for read, write and delete requests"
FAILED for resource: azurerm_storage_account.this
File: /main.tf:15-21
Guide: https://docs.bridgecrew.io/docs/enable-requests-on-storage-logging-for-queue-service

Check: CKV_AZURE_3: "Ensure that 'Secure transfer required' is set to 'Enabled'"
FAILED for resource: azurerm_storage_account.this
File: /main.tf:15-21
Guide: https://docs.bridgecrew.io/docs/ensure-secure-transfer-required-is-enabled

Check: CKV_AZURE_44: "Ensure Storage Account is using the latest version of TLS encryption"
FAILED for resource: azurerm_storage_account.this
File: /main.tf:15-21
Guide: https://docs.bridgecrew.io/docs/bc_azr_storage_2

Check: CKV_AZURE_35: "Ensure default network access rule for Storage Accounts is set to deny"
FAILED for resource: azurerm_storage_account.this
File: /main.tf:15-21
Guide: https://docs.bridgecrew.io/docs/set-default-network-access-rule-for-storage-accounts-to-deny

From the output, it looks like we are not following best practices on our storage account. We are not enforcing HTTPS or the latest TLS version, and our default firewall rules are too permissive. Since we will not be using the queue service, we can ignore that check. Each check provides a link to a web page explaining the error and how to fix it.

Following Checkov’s recommendations and the linked documentation, we refactor our storage account as follows:

provider "azurerm" {
  version = "=2.43.0"
  features {}
}

variable "environment" {
  type = string
}

resource "azurerm_resource_group" "this" {
  name     = "${var.environment}-myrg"
  location = "West Europe"
}

resource "azurerm_storage_account" "this" {
  #checkov:skip=CKV_AZURE_33

  name                     = "mystorageaccount"
  account_replication_type = "LRS"
  account_tier             = "Premium"
  location                 = azurerm_resource_group.this.location
  resource_group_name      = azurerm_resource_group.this.name

  min_tls_version           = "TLS1_2"
  enable_https_traffic_only = true

  network_rules {
    default_action = "Deny"
    ip_rules       = ["80.0.2.0"]
  }
}

We changed our firewall rules to deny by default and only accept our public IP. Moreover, we now enforce HTTPS and a higher TLS version. Note the comment that skips the queue check.

Let’s re-run the scan:

$ checkov --directory src

Checkov now accepts our infrastructure.
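
To make the idea of per-resource checks more concrete, here is a toy sketch in Python. It is not how Checkov is implemented, and the function names (https_only, tls_1_2, default_deny, evaluate) are made up for this illustration. It only shows the general shape: each check looks at the attributes of a single resource, in isolation from the rest of the infrastructure.

```python
# Toy illustration only: this is NOT Checkov's implementation.
# Every check is a function that inspects one resource's attributes.


def https_only(attrs):
    # Passes only when HTTPS is explicitly enforced.
    return attrs.get("enable_https_traffic_only") is True


def tls_1_2(attrs):
    # Passes only when the minimum TLS version is pinned to 1.2.
    return attrs.get("min_tls_version") == "TLS1_2"


def default_deny(attrs):
    # Passes only when the default firewall action is "Deny".
    rules = attrs.get("network_rules") or {}
    return rules.get("default_action") == "Deny"


# Hypothetical registry: resource type -> checks that apply to it.
CHECKS = {
    "azurerm_storage_account": [https_only, tls_1_2, default_deny],
}


def evaluate(resource_type, attrs):
    """Return the names of the checks that fail for a single resource."""
    checks = CHECKS.get(resource_type, [])
    return [check.__name__ for check in checks if not check(attrs)]


# The storage account before and after the refactoring, as plain dicts.
initial = {"name": "mystorageaccount", "account_tier": "Premium"}
hardened = {
    **initial,
    "min_tls_version": "TLS1_2",
    "enable_https_traffic_only": True,
    "network_rules": {"default_action": "Deny", "ip_rules": ["80.0.2.0"]},
}

print(evaluate("azurerm_storage_account", initial))
print(evaluate("azurerm_storage_account", hardened))
```

The first call reports all three toy checks as failing, and the second reports none, which mirrors the two Checkov runs above.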

Custom checks and Terraform plan

Now let’s create a custom policy to enforce the naming scheme. Here’s our custom_policies.py:

import re
from checkov.common.models.enums import CheckResult, CheckCategories
from checkov.terraform.checks.resource.base_resource_check import BaseResourceCheck


class ResourceGroupPrefix(BaseResourceCheck):
    def __init__(self):
        super().__init__(
            name="Ensure resource group is prefixed by the environment name",
            id="MYORG_RG_001",
            categories=[CheckCategories.CONVENTION],
            supported_resources=["azurerm_resource_group"],
        )

    def scan_resource_conf(self, conf, entity_type):
        # Checkov passes attribute values wrapped in a list, hence the [0].
        name = conf["name"][0]
        # The name must start with "dev-" or "prod-".
        if re.match(r"^(dev|prod)-", str(name)):
            return CheckResult.PASSED
        return CheckResult.FAILED


scanner = ResourceGroupPrefix()

Don’t forget to also include an empty __init__.py file in the policies folder. Because the environment is passed in as an input variable, the Terraform code alone does not contain the final resource group name, so Checkov cannot verify the naming convention by scanning the source. For this case, we’ll need to perform our checks on the Terraform plan (tfplan), where variables are already resolved. In the src folder, we create the tfplan with the input variable environment set to dev:

$ terraform init && terraform plan --var environment=dev --out terraform.tfplan.binary

The tfplan is in a binary format. For Checkov to understand the plan, we must first convert it to JSON:

$ terraform show --json terraform.tfplan.binary | jq '.' > terraform.tfplan.json
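
To see what the JSON plan gives us that the source code does not, the following sketch pulls the resolved resource group names out of a plan document and tests them against the naming scheme. It assumes the planned_values.root_module.resources layout of Terraform's JSON plan output. The sample plan is a hand-written, heavily trimmed stand-in for terraform.tfplan.json, and the helper names (resource_group_names, follows_scheme) are hypothetical.

```python
import json
import re

# Same prefix rule as the custom policy: "dev-" or "prod-".
PREFIX = re.compile(r"^(dev|prod)-")


def follows_scheme(name):
    """True when the name is a string starting with an allowed prefix."""
    return isinstance(name, str) and PREFIX.match(name) is not None


def resource_group_names(plan):
    """Map resource address -> planned name for every resource group."""
    names = {}

    def walk(module):
        for res in module.get("resources", []):
            if res.get("type") == "azurerm_resource_group":
                names[res["address"]] = res.get("values", {}).get("name")
        # Resources declared inside modules are nested under child_modules.
        for child in module.get("child_modules", []):
            walk(child)

    walk(plan.get("planned_values", {}).get("root_module", {}))
    return names


# Hand-written, trimmed stand-in for terraform.tfplan.json (assumption).
sample_plan = json.loads("""
{
  "planned_values": {
    "root_module": {
      "resources": [
        {
          "address": "azurerm_resource_group.this",
          "type": "azurerm_resource_group",
          "name": "this",
          "values": {"name": "dev-myrg"}
        }
      ]
    }
  }
}
""")

names = resource_group_names(sample_plan)
print(names)

# In the source code the name is still an unresolved expression.
print(follows_scheme("${var.environment}-myrg"))

# In the plan the variable has been substituted.
print(all(follows_scheme(n) for n in names.values()))
```

The unresolved expression from main.tf does not match the prefix rule, while the name taken from the plan does. That is the reason the custom check is run against the plan rather than the source directory.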

Now we can run Checkov with the JSON tfplan and custom policy. In the example_storage_account folder, execute the following command:

$ checkov --file src/terraform.tfplan.json --external-checks-dir policies

Checkov will check the custom policy and evaluate it as PASSED. Here’s the abridged output:

Passed checks: 7, Failed checks: 0, Skipped checks: 0
Check: MYORG_RG_001: "Ensure resource group is prefixed by the environment name"
PASSED for resource: azurerm_resource_group.this
File: /src/terraform.tfplan.json:18-23

I’ve hosted the complete example here.

Further reading

Checkov provides great documentation on its built-in policies and how to implement custom ones. Here you can find how to implement custom policies in YAML and here you can find example policies. For Python, you can find documentation and examples here.

Conclusion

For a DevOps team, static analysis is yet another tool for deploying high-quality infrastructure as code. You can use Checkov alongside your existing Terratest code to provide an extra layer of safety. With Checkov, you can ensure your code follows best practices and is compliant with your organization’s policies.

The same caveats apply as with Terratest. Using Checkov means another programming language to learn on top of Terraform, and a tradeoff between writing Terraform code and writing test code. My recommendation is to start by using Checkov’s built-in policies in your local development environment. As you saw in the example, they provide valuable feedback and guide your development towards a more secure and reliable infrastructure.

Happy coding!