Infrapolicies - policy as code

What are InfraPolicies (Policy as Code)?

Introduction

Infrastructure Policies (InfraPolicies) are an implementation of Policy as Code. The idea is to define validation rules that give fine-grained control over changes to an organization's infrastructure.

The validation process is based on the Open Policy Agent engine, which allows us to write a set of rules using the Rego language to express validation behavior.

Infrastructure changes are represented by changes in the Terraform Plan, which must respect the defined rules. For example, you can ensure that tags are applied to all resources, or that no instance larger than t2.large is used.

Concepts

An InfraPolicy is represented by a Body that contains the set of rules and a Severity that defines the enforcement level.

Body

The rules that deny or allow changes are written in the Rego language. Currently only allow and deny rules are checked. See the examples below.

The Input body for the OPA engine during validation will include the following fields:

{
    "environment_canonical": "env-example-canonical",
    "project_canonical": "proj-example-canonical",
    "tfplan": {
        ...
    }
}

This allows rules in the Body code to include checks on environment_canonical and project_canonical:

package example

default allow = false

allow {
    input.project_canonical == "proj-example-canonical"
    input.environment_canonical == "env-example-canonical"
}

The tfplan field holds the Terraform Plan in JSON format. Rules reference specific fields of the plan through dot notation:

package example

default allow = false

allow {
    input.tfplan.resource_changes[_].change.after["max_size"] < 5
}

The allow rule must return a boolean value:

package example

default allow = false

allow {
    input.project_canonical == "proj-example-canonical"
}

The deny rule must return a collection of strings (in Rego, the set built by deny[reason]) containing the reasons for the failure:

package example

deny[reason] {
    not input.project_canonical == "proj-example-canonical"

    reason := sprintf("The project canonical %q is not expected", [input.project_canonical])
}

Warning

allow and deny are opposite concepts: validation fails when an allow rule evaluates to false, and it fails when a deny rule evaluates to true (that is, when it returns at least one reason).

The OPA documentation contains useful information on how to write policies that include deny rules, and defines best practices to follow in order to build reliable policy systems.

Severity

The client checks the Severity level to decide what to do with changes in the event an InfraPolicy has not been respected.

  • Critical: the changes must be blocked
  • Warning: the changes are blocked but they can be overridden with a manual operation
  • Advisory: the changes can be automatically applied but a notification must be sent

Status

The status can be enabled or disabled. When it is disabled the InfraPolicy will be excluded from the validation process.

Testing

The OPA engine has a dedicated command for testing, opa test. Follow the best practices suggested in the OPA testing documentation. Besides the opa test command, the OPA ecosystem has a Playground that can be used for fast and simple assertions.
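
As an illustration of what such a test can look like, here is a minimal sketch that checks the project canonical deny rule shown earlier on this page. It uses the same pre-1.0 Rego syntax as the other examples; OPA 1.0 and later may expect the newer if and contains keywords unless run in a v0 compatibility mode. The test names and the "other-project" value are hypothetical, and the policy is repeated in the same file only to keep the sketch self-contained.

```rego
package example

# Policy under test (copied from the deny example above).
deny[reason] {
    not input.project_canonical == "proj-example-canonical"

    reason := sprintf("The project canonical %q is not expected", [input.project_canonical])
}

# The expected project must not produce any denial reason.
test_expected_project_is_not_denied {
    count(deny) == 0 with input as {"project_canonical": "proj-example-canonical"}
}

# Any other project (hypothetical value) must produce exactly one reason.
test_unexpected_project_is_denied {
    count(deny) == 1 with input as {"project_canonical": "other-project"}
}
```

Saved under a name such as policy_test.rego (an assumed file name), the tests can be run by passing that file to the opa test command.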

Validation

To validate your Terraform Plan against the defined policies, you can use one of the following methods:

Locally

A Terraform Plan can be validated locally using the Cycloid CLI with the validate subcommand:

$ terraform plan -out=./plan; terraform show -json ./plan > plan.json
$ cy infrapolicy validate --plan-path ./plan.json
ADVISORIES   CRITICALS   WARNINGS
1            0           0

In Cycloid pipeline

In a pipeline context, a Concourse Resource is available and can be easily plugged right after a terraform plan step and just before a notification mechanism.

Example of output

Validation with advisories: the job is green and displays the advisory results as metadata.

[Screenshot: Advisories]

Validation with criticals and/or warnings: the job fails and displays the results as metadata.

[Screenshot: Criticals]

Setup

  • Create a new policy from the Security / InfraPolicies page by clicking on the Add InfraPolicy button.

  • Fill in the mandatory fields and click on Save to add the resource. Enable the policy to include it in the next validation run.

[Screenshot: Create InfraPolicy]

  • Add the pipeline and a validation resource as described before.

  • Try to submit some unexpected infrastructure changes that go against your policies (e.g. try to double the size of the instance) to see InfraPolicy in action.

Code examples

Examples of Rego code for some common use cases. Other examples can be found in the fugue/regula or open-policy-agent/conftest repositories.

Tags required

Allow only resources with defined tags.

package example

deny[reason] {
    resource := input.tfplan.planned_values.root_module.resources[_]
    not resource.values.tags

    reason = sprintf("tags required for the resource %q", [resource.address])
}
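
The rule above only checks that some tags exist. If specific tag keys are mandatory, one possible extension is sketched below. The tag keys "owner" and "env" are hypothetical placeholders, not values defined by Cycloid, and like the rule above this sketch only looks at resources in the root module.

```rego
package example

# Hypothetical set of mandatory tag keys; adapt to your own conventions.
required_tags = {
    "owner",
    "env"
}

deny[reason] {
    resource := input.tfplan.planned_values.root_module.resources[_]

    # Iterate over every required key and keep those the resource lacks.
    required_tags[tag]
    not resource.values.tags[tag]

    reason := sprintf("resource %q is missing the required tag %q", [resource.address, tag])
}
```

Each missing key yields its own reason, so a resource lacking both tags is reported twice, once per key.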

Instance type

Allow only specific types of instances.

package example

allowed_instance_types = {
    "t2.medium",
    "t2.large",
    "t2.xlarge"
}

deny[reason] {
    itype := input.tfplan.resource_changes[_].change.after["instance_type"]
    not allowed_instance_types[itype]

    reason = sprintf("instance_type %q is not accepted, use one of the allowed: %v", [itype, allowed_instance_types])
}

Instance quantity

Define the maximum number of running instances that the Auto Scaling Group can spawn.

package example

default allow = false

allow {
    input.tfplan.resource_changes[_].change.after["max_size"] < 5
}
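
Note that [_] is existential in Rego: the allow rule above succeeds as soon as at least one resource change in the plan has a max_size below 5, even if another one exceeds it. If the intent is to check every Auto Scaling Group, a deny rule is one way to express it. The following is a sketch under that assumption; the name max_size_limit is hypothetical, and the resource type aws_autoscaling_group assumes the Terraform AWS provider.

```rego
package example

# Hypothetical upper bound, mirroring the "< 5" check above.
max_size_limit = 5

deny[reason] {
    resource := input.tfplan.resource_changes[_]
    resource.type == "aws_autoscaling_group"

    # Evaluated for each Auto Scaling Group, so one oversized group is enough to fail.
    resource.change.after.max_size >= max_size_limit

    reason := sprintf("max_size of %q must be lower than %v", [resource.address, max_size_limit])
}
```

Resources being deleted have no after values, so they do not match this rule and are not reported.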

Only a specific region

Allow only a certain set of cloud provider regions.

package example

default allow = false

allow {
    provider := input.tfplan.configuration.provider_config.aws
    provider.expressions.region.constant_value == "eu-west-1"
}

Security group required

Make the security group mandatory.

package example

deny[reason] {
    r := input.tfplan.resource_changes[_]
    r.change.after_unknown.vpc_security_group_ids == true

    reason := "A security group is required"
}

Specific AMI

Allow only a specific set of AWS AMIs.

package example

import input.tfplan as tfplan

allowed_amis = {
    "ami-abc",
    "ami-xyz"
}

deny[reason] {
    ami := tfplan.resource_changes[_].change.after.image_id
    not allowed_amis[ami]

    reason := sprintf("AWS AMI %q is not accepted, use one of the allowed: %v", [ami, allowed_amis])
}

RDS Backup set

Allow only RDS databases with a backup set in the production environment.

package example

is_production {
    input.environment_canonical == "prod"
}

deny[reason] {
    is_production

    resource := input.tfplan.planned_values.root_module.resources[_]
    resource.type == "aws_db_instance"
    not resource.values.backup_retention_period > 0

    reason = sprintf("Backup is required on production for %q", [resource.address])
}