Static Analysis of Terraform code with Checkov
by Pedro Santos
April 23, 2022
In the previous post about Terraform, I made a case for testing your Terraform code with Go and Terratest. In this post, I’ll make a case for static analysis tools. Static analysis tools for Terraform are a powerful mechanism to help your team follow industry best practices. In addition, your organization’s infrastructure team can leverage static analysis tools and custom checks to document and enforce company-wide policies.
These tools operate on the Terraform code or on the Terraform plan, without deploying anything. Hence, they are faster to run than an end-to-end test in Terratest. Instead of working on the Terraform infrastructure as a whole, static analysis tools focus on each resource individually.
Checkov
The software I recommend for static analysis of Terraform is Checkov. It provides a comprehensive set of built-in policies for the most common pitfalls, has excellent support for custom policies, and integrates very well with most CI/CD services.
Nowadays, Checkov supports two methods for implementing custom policies:
- Python code
- YAML definition file
YAML tends to be much easier to start with, but for any non-trivial policy, the equivalent Python code will be cleaner. If your team has experience with Python, implementing policies with it will be a gratifying experience. If your team does not have experience with Python, the cost of learning it will not outweigh the disadvantages of the YAML method.
Example
The following is an example deployment with some checks. Like the Terratest blog post, this example does not represent a real workload, nor does it follow best practices. We’ll use Azure for this example, with custom checks implemented in Python.
Say you’d like to deploy a storage account in your organization’s subscription. According to your organization’s naming scheme, one must prefix every resource group’s name with the environment targeted (either “dev” or “prod”).
We’ll organize the code with the following structure:
example_storage_account
├── policies
│   ├── custom_policies.py
│   └── __init__.py
└── src
    └── main.tf
Our initial iteration of src/main.tf looks like this:
provider "azurerm" {
  version = "=2.43.0"
  features {}
}

variable "environment" {
  type = string
}

resource "azurerm_resource_group" "this" {
  name     = "${var.environment}-myrg"
  location = "West Europe"
}

resource "azurerm_storage_account" "this" {
  name                     = "mystorageaccount"
  account_replication_type = "LRS"
  account_tier             = "Premium"
  location                 = azurerm_resource_group.this.location
  resource_group_name      = azurerm_resource_group.this.name
}
Now let’s run the Terraform code through Checkov. Install the tool with:
$ pip install checkov
Now, in the directory example_storage_account, run Checkov with:
$ checkov --directory src
Surprisingly, Checkov failed our deployment! Here’s the abridged output:
Check: CKV_AZURE_33: "Ensure Storage logging is enabled for Queue service for read, write and delete requests"
FAILED for resource: azurerm_storage_account.this
File: /main.tf:15-21
Guide: https://docs.bridgecrew.io/docs/enable-requests-on-storage-logging-for-queue-service
Check: CKV_AZURE_3: "Ensure that 'Secure transfer required' is set to 'Enabled'"
FAILED for resource: azurerm_storage_account.this
File: /main.tf:15-21
Guide: https://docs.bridgecrew.io/docs/ensure-secure-transfer-required-is-enabled
Check: CKV_AZURE_44: "Ensure Storage Account is using the latest version of TLS encryption"
FAILED for resource: azurerm_storage_account.this
File: /main.tf:15-21
Guide: https://docs.bridgecrew.io/docs/bc_azr_storage_2
Check: CKV_AZURE_35: "Ensure default network access rule for Storage Accounts is set to deny"
FAILED for resource: azurerm_storage_account.this
File: /main.tf:15-21
Guide: https://docs.bridgecrew.io/docs/set-default-network-access-rule-for-storage-accounts-to-deny
From the output, it looks like we are not following best practices on our storage account. We are not enforcing HTTPS or a recent TLS version, and our default firewall rules are too permissive. Since we will not be using the queue service, we can ignore that check. Each check provides a link to a web page explaining the error and how to fix it.
Following Checkov’s recommendations and the linked documentation, we refactor our storage account as follows:
provider "azurerm" {
  version = "=2.43.0"
  features {}
}

variable "environment" {
  type = string
}

resource "azurerm_resource_group" "this" {
  name     = "${var.environment}-myrg"
  location = "West Europe"
}

resource "azurerm_storage_account" "this" {
  #checkov:skip=CKV_AZURE_33
  name                      = "mystorageaccount"
  account_replication_type  = "LRS"
  account_tier              = "Premium"
  location                  = azurerm_resource_group.this.location
  resource_group_name       = azurerm_resource_group.this.name
  min_tls_version           = "TLS1_2"
  enable_https_traffic_only = true

  network_rules {
    default_action = "Deny"
    ip_rules       = ["80.0.2.0"]
  }
}
We changed our firewall rules to deny by default and only accept our public IP. Moreover, we now enforce HTTPS and a higher TLS version. Note the comment that skips the queue check.
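To make the before/after difference concrete, here is a minimal Python sketch of what these three checks boil down to. The functions are simplified stand-ins written for this post, not Checkov’s actual implementation, and they assume that a missing attribute counts as a failure (which matches the output above). The attribute names come from the Terraform code; everything else is hypothetical.

```python
# Simplified stand-ins for three built-in checks. This is NOT Checkov's code;
# it only mirrors the attributes we changed in main.tf.

def https_only(attrs):
    # Assumption: a missing attribute is treated as a failure.
    return attrs.get("enable_https_traffic_only") is True


def modern_tls(attrs):
    return attrs.get("min_tls_version") == "TLS1_2"


def default_deny(attrs):
    rules = attrs.get("network_rules", {})
    return rules.get("default_action") == "Deny"


# Check IDs taken from the Checkov output shown earlier.
RULES = {
    "CKV_AZURE_3": https_only,
    "CKV_AZURE_44": modern_tls,
    "CKV_AZURE_35": default_deny,
}


def failed_rules(attrs):
    """Return the sorted IDs of the rules this single resource does not satisfy."""
    return sorted(rule_id for rule_id, rule in RULES.items() if not rule(attrs))


# The storage account before and after the refactoring, as plain dictionaries.
before = {
    "name": "mystorageaccount",
    "account_replication_type": "LRS",
    "account_tier": "Premium",
}
after = dict(
    before,
    min_tls_version="TLS1_2",
    enable_https_traffic_only=True,
    network_rules={"default_action": "Deny", "ip_rules": ["80.0.2.0"]},
)

print(failed_rules(before))
print(failed_rules(after))
```

Each rule only looks at the attributes of one resource, which is why this kind of analysis is fast and needs no cloud access.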
Let’s re-run Checkov:
$ checkov --directory src
Checkov now accepts our infrastructure.
Custom checks and Terraform plan
Now let’s create a custom policy to enforce the naming scheme. Here’s our custom_policies.py:
import re

from checkov.common.models.enums import CheckResult, CheckCategories
from checkov.terraform.checks.resource.base_resource_check import BaseResourceCheck


class ResourceGroupPrefix(BaseResourceCheck):
    def __init__(self):
        super().__init__(
            name="Ensure resource group is prefixed by the environment name",
            id="MYORG_RG_001",
            categories=[CheckCategories.CONVENTION],
            supported_resources=["azurerm_resource_group"],
        )

    # entity_type is optional so the check works whether or not the
    # installed Checkov version passes it.
    def scan_resource_conf(self, conf, entity_type=None):
        # Attribute values arrive wrapped in a list, hence the [0].
        name = conf["name"][0]
        # The name must start with "dev-" or "prod-".
        result = re.match(r"^(dev|prod)-", name)
        if result is None:
            return CheckResult.FAILED
        else:
            return CheckResult.PASSED


# Instantiating the check registers it with Checkov.
scanner = ResourceGroupPrefix()
Don’t forget to also add an empty __init__.py file to the policies folder.
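Because the environment is passed in as an input variable, the resource group’s name is only known once a value is supplied. Checkov therefore has no way to tell from the Terraform code alone whether the name respects the naming convention.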
For this case, we’ll need to perform our checks on the Terraform plan (tfplan).
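The difference is easy to see with a small stand-alone sketch of the check’s logic. It does not import Checkov: the function below is a hypothetical stand-in that returns plain strings, and the unresolved form of the name is an assumption about what a check might see when scanning the code directly (the exact representation may differ).

```python
import re

# Same pattern as in the custom policy: the name must start with "dev-" or "prod-".
PREFIX = re.compile(r"^(dev|prod)-")


def scan_resource_conf(conf):
    """Stand-in for the custom check's logic, returning a plain string."""
    name = conf["name"][0]  # attribute values arrive wrapped in a list
    return "PASSED" if PREFIX.match(name) else "FAILED"


# Assumed shape when scanning main.tf: the variable is still unresolved.
from_code = {"name": ["${var.environment}-myrg"]}
# Shape when scanning a plan created with environment=dev.
from_plan = {"name": ["dev-myrg"]}

print(scan_resource_conf(from_code))
print(scan_resource_conf(from_plan))
```

With the unresolved expression the regular expression cannot match, whereas the plan carries the final value and the check can do its job.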
In the src folder, we create the tfplan with the input variable environment set to dev.
$ terraform init && terraform plan --var environment=dev --out terraform.tfplan.binary
The tfplan is in a binary format. For Checkov to understand the plan, we must first convert it to JSON:
$ terraform show --json terraform.tfplan.binary | jq '.' > terraform.tfplan.json
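To get a feel for what the JSON plan contains, here is a short Python sketch that lists the resource groups it would create. It relies on the planned_values.root_module.resources section of Terraform’s JSON plan output. The sample document is hand-written and heavily trimmed for illustration (a real plan has many more fields), and the helper name is hypothetical.

```python
import json


def planned_resource_groups(plan):
    """Yield (address, name) for each resource group in the root module.

    Child modules are not traversed in this sketch.
    """
    root = plan.get("planned_values", {}).get("root_module", {})
    for resource in root.get("resources", []):
        if resource.get("type") == "azurerm_resource_group":
            yield resource["address"], resource["values"]["name"]


# Hand-written, trimmed sample standing in for terraform.tfplan.json.
sample = json.loads("""
{
  "planned_values": {
    "root_module": {
      "resources": [
        {
          "address": "azurerm_resource_group.this",
          "type": "azurerm_resource_group",
          "name": "this",
          "values": {"name": "dev-myrg", "location": "westeurope"}
        },
        {
          "address": "azurerm_storage_account.this",
          "type": "azurerm_storage_account",
          "name": "this",
          "values": {"name": "mystorageaccount"}
        }
      ]
    }
  }
}
""")

print(list(planned_resource_groups(sample)))
```

To try it on the real file, replace the sample with the result of json.load on src/terraform.tfplan.json. Note that the variable has already been resolved: the plan contains the final name, not the interpolation expression.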
Now we can run Checkov with the JSON tfplan and custom policy.
From the example_storage_account folder, execute the following command:
$ checkov --file src/terraform.tfplan.json --external-checks-dir policies
Checkov will check the custom policy and evaluate it as PASSED. Here’s the abridged output:
Passed checks: 7, Failed checks: 0, Skipped checks: 0
Check: MYORG_RG_001: "Ensure resource group is prefixed by the environment name"
PASSED for resource: azurerm_resource_group.this
File: /src/terraform.tfplan.json:18-23
I’ve hosted the complete example here.
Further reading
Checkov provides great documentation on its built-in policies and how to implement custom ones. Here you can find how to implement custom policies in YAML and here you can find example policies. For Python, you can find documentation and examples here.
Conclusion
For a DevOps team, static analysis is yet another tool for deploying high-quality infrastructure as code. You can use Checkov alongside your existing Terratest code to provide an extra layer of safety. With Checkov, you can ensure your code follows best practices and is compliant with your organization’s policies.
The same caveats apply as with Terratest. Using Checkov means another programming language to learn on top of Terraform, and a tradeoff between writing Terraform code and writing test code. My recommendation is to start by using Checkov’s built-in policies in your local development environment. As you saw in the example, they provide valuable feedback and guide your development towards a more secure and reliable infrastructure.
Happy coding!