Building a Comprehensive IaC Security Pipeline: From Detection to Remediation
Key takeaways
- IaC security scanning detects misconfigurations before deployment, enabling shift-left security without slowing development
- Multi-layered approach combining static scanners (Checkov, tfsec), policy engines (OPA), and secret detection prevents 90%+ of common issues
- Start with strict enforcement on critical issues, then relax selectively with documented suppressions and peer review
- Pre-commit hooks and IDE integration catch 60-80% of security issues before code review, reducing remediation time
- Progressive rollout (visibility → soft enforcement → full enforcement) minimizes disruption while building security culture
The shift to Infrastructure as Code fundamentally transformed how we think about security. Instead of auditing running infrastructure after deployment, we can now detect security issues, policy violations, and compliance gaps before a single resource is created. This shift-left approach enables development velocity while maintaining security posture, but only if implemented correctly.
This guide provides a comprehensive framework for building production-grade security automation into your IaC pipeline, covering tooling, implementation patterns, policy management, and organizational processes.
The Security Scanning Landscape
Categories of IaC Security Tools
The IaC security ecosystem has matured into several distinct categories, each serving specific purposes:
1. Static Security Scanners
- Purpose: Detect known security misconfigurations and vulnerabilities
- Examples: Checkov, tfsec, Terrascan, cfn-nag
- Strengths: Fast, rule-based detection of common security issues
- Limitations: Can't understand business context or complex relationships
2. Policy-as-Code Engines
- Purpose: Enforce custom organizational policies and standards
- Examples: Open Policy Agent (OPA), HashiCorp Sentinel, AWS Config Rules
- Strengths: Flexible, custom policy logic tailored to your organization
- Limitations: Requires policy authoring expertise
3. Compliance Frameworks
- Purpose: Validate infrastructure against regulatory standards
- Examples: Prowler, AWS Security Hub, Cloud Custodian
- Strengths: Pre-built compliance mappings (SOC 2, HIPAA, PCI-DSS)
- Limitations: May require runtime scanning in addition to static analysis
4. Secret Detection Tools
- Purpose: Prevent credentials and secrets from being committed to code
- Examples: TruffleHog, git-secrets, GitGuardian
- Strengths: Catches accidental credential exposure
- Limitations: High false positive rates without tuning
5. Software Composition Analysis (SCA)
- Purpose: Scan IaC dependencies for vulnerabilities
- Examples: Dependabot, Snyk, Renovate
- Strengths: Keeps providers and modules up-to-date
- Limitations: May suggest breaking changes
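As a sketch of the SCA layer, here is a minimal Dependabot configuration that keeps Terraform providers and modules current (the directory path is an assumption; adjust it to your repository layout):

```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "terraform"
    directory: "/terraform"        # assumed location of .tf files
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 5    # cap noise from version-bump PRs
```

Pair this with your security scanners so that version-bump PRs are themselves scanned before merging.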
Tool Deep Dive: Choosing the Right Scanner
Checkov: The Comprehensive Choice
Why Checkov Stands Out:
- Multi-framework support (Terraform, CloudFormation, Kubernetes, Helm, ARM, CDK)
- 1,000+ built-in policies covering CIS benchmarks, NIST, PCI-DSS
- Active development and community
- Free and open-source
- Graph-based analysis for complex relationships
- Integration with Bridgecrew platform (optional)
When to Choose Checkov:
- You need multi-cloud/multi-framework support
- You want comprehensive out-of-the-box policies
- You value active community support
- You may want commercial support options later
tfsec: Fast and Focused
Why tfsec Stands Out:
- Extremely fast (written in Go)
- Terraform-specific, deeply integrated (note: tfsec has since been folded into Aqua's Trivy, which Aqua recommends for new adoption)
- Clear, actionable output
- Low false positive rate
- Custom check support
When to Choose tfsec:
- You're Terraform-only
- Speed is critical (large codebases)
- You want minimal configuration
- You prefer focused tooling over comprehensive platforms
Terrascan: Policy-Driven Approach
Why Terrascan Stands Out:
- OPA (Rego) policy engine integration
- Multi-framework support
- Admission controller for Kubernetes
- Custom policy flexibility
When to Choose Terrascan:
- You're already using OPA
- You need admission control capabilities
- You want to write custom policies in Rego
- Kubernetes is part of your infrastructure
Comparative Analysis
| Feature | Checkov | tfsec | Terrascan | cfn-nag |
|---|---|---|---|---|
| Terraform | ✓ | ✓ | ✓ | ✗ |
| CloudFormation | ✓ | ✗ | ✓ | ✓ |
| CDK | ✓ | ✗ | ✗ | ✓ (synth) |
| Kubernetes | ✓ | ✗ | ✓ | ✗ |
| Custom Policies | Python | JSON/YAML | Rego (OPA) | Ruby |
| Speed | Medium | Fast | Medium | Medium |
| Policy Count | 1,000+ | 300+ | 500+ | 200+ |
| Graph Analysis | ✓ | ✗ | ✗ | ✗ |
Building the Security Pipeline
Level 1: Basic GitHub Actions Integration
Start with a simple workflow that blocks obvious security issues:
name: IaC Security Scan
on:
pull_request:
branches: [main, develop]
paths:
- 'terraform/**'
- 'infrastructure/**'
- '.github/workflows/security.yml'
permissions:
contents: read
pull-requests: write
security-events: write
jobs:
security-scan:
name: Security Scan
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0 # Full history for better analysis
- name: Run Checkov
id: checkov
uses: bridgecrewio/checkov-action@v12
with:
directory: ./terraform
framework: terraform
soft_fail: false
download_external_modules: true
output_format: cli,sarif
output_file_path: console,results.sarif
- name: Upload SARIF results
if: always()
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: results.sarif
          category: checkov
Key Features:
- Runs on PR to main/develop branches
- Only triggers when IaC files change (efficiency)
- Uploads results to GitHub Security tab (SARIF format)
- Fails the build on security issues
Level 2: Multi-Tool Defense in Depth
Layer multiple tools for comprehensive coverage:
name: IaC Security Suite
on:
pull_request:
branches: [main, develop]
schedule:
- cron: '0 0 * * 0' # Weekly full scan
jobs:
static-analysis:
name: Static Security Analysis
runs-on: ubuntu-latest
strategy:
matrix:
scanner: [checkov, tfsec, terrascan]
fail-fast: false # Run all scanners even if one fails
steps:
- uses: actions/checkout@v4
- name: Run Checkov
if: matrix.scanner == 'checkov'
uses: bridgecrewio/checkov-action@v12
with:
directory: ./terraform
framework: terraform
output_format: sarif
output_file_path: checkov.sarif
- name: Run tfsec
if: matrix.scanner == 'tfsec'
uses: aquasecurity/tfsec-action@v1.0.3
with:
working_directory: ./terraform
format: sarif
sarif_file: tfsec.sarif
- name: Run Terrascan
if: matrix.scanner == 'terrascan'
uses: tenable/terrascan-action@v1.5.0
with:
iac_type: 'terraform'
iac_dir: './terraform'
sarif_upload: true
- name: Upload results
if: always()
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: ${{ matrix.scanner }}.sarif
category: ${{ matrix.scanner }}
secret-scan:
name: Secret Detection
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: TruffleHog Scan
uses: trufflesecurity/trufflehog@main
with:
path: ./
base: ${{ github.event.repository.default_branch }}
head: HEAD
extra_args: --only-verified
dependency-scan:
name: Dependency Security
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Run Snyk IaC
uses: snyk/actions/iac@master
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
with:
file: ./terraform
          args: --severity-threshold=high
Multi-Layer Benefits:
- Different tools catch different issues
- Cross-validation reduces false negatives
- Tool-specific strengths complement each other
- Secret scanning prevents credential leaks
- Dependency scanning keeps modules secure
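Because each scanner uploads its own SARIF file, it can help to aggregate findings for a single report. A minimal sketch follows; the inline SARIF fragments are hypothetical minimal examples rather than real Checkov/tfsec output, and true cross-tool de-duplication would additionally need a rule-ID mapping, since each scanner uses its own IDs:

```python
"""Aggregate SARIF results from several scanner runs, de-duplicated."""

def merge_sarif_runs(runs):
    """Return unique (rule_id, file, line) findings across runs."""
    seen = set()
    for run in runs:
        for result in run.get("results", []):
            loc = result["locations"][0]["physicalLocation"]
            seen.add((
                result["ruleId"],
                loc["artifactLocation"]["uri"],
                loc["region"]["startLine"],
            ))
    return sorted(seen)

run_a = {"results": [
    {"ruleId": "CKV_AWS_19",
     "locations": [{"physicalLocation": {
         "artifactLocation": {"uri": "terraform/s3.tf"},
         "region": {"startLine": 3}}}]},
]}
run_b = {"results": [
    # the same finding reported again, plus one new finding
    {"ruleId": "CKV_AWS_19",
     "locations": [{"physicalLocation": {
         "artifactLocation": {"uri": "terraform/s3.tf"},
         "region": {"startLine": 3}}}]},
    {"ruleId": "CKV_AWS_21",
     "locations": [{"physicalLocation": {
         "artifactLocation": {"uri": "terraform/s3.tf"},
         "region": {"startLine": 3}}}]},
]}

findings = merge_sarif_runs([run_a, run_b])
print(f"{len(findings)} unique findings")
```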
Level 3: Policy as Code with OPA
Implement custom organizational policies:
name: Policy Enforcement
on:
pull_request:
branches: [main]
jobs:
policy-check:
name: OPA Policy Validation
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup OPA
uses: open-policy-agent/setup-opa@v2
with:
version: latest
- name: Terraform Plan
run: |
cd terraform
terraform init -backend=false
terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > tfplan.json
- name: Validate against policies
run: |
opa eval \
--data policies/ \
--input terraform/tfplan.json \
--format pretty \
'data.terraform.deny' \
| tee policy-results.txt
# Fail if any denials found
if grep -q "true" policy-results.txt; then
echo "Policy violations found!"
exit 1
          fi
Example OPA Policy (Rego):
# policies/terraform.rego
package terraform
import future.keywords.in
import future.keywords.if
# Deny if any S3 bucket is publicly accessible
deny[msg] {
resource := input.planned_values.root_module.resources[_]
resource.type == "aws_s3_bucket"
resource.values.acl == "public-read"
msg := sprintf(
"S3 bucket '%s' must not be publicly readable. Use CloudFront instead.",
[resource.address]
)
}
# Deny if RDS instance is not in private subnet
deny[msg] {
resource := input.planned_values.root_module.resources[_]
resource.type == "aws_db_instance"
not resource.values.publicly_accessible == false
msg := sprintf(
"RDS instance '%s' must not be publicly accessible.",
[resource.address]
)
}
# Require specific tags on all resources
required_tags := ["Environment", "Owner", "CostCenter", "Application"]
deny[msg] {
resource := input.planned_values.root_module.resources[_]
resource_tags := object.get(resource.values, "tags", {})
missing_tags := [tag |
tag := required_tags[_]
not resource_tags[tag]
]
count(missing_tags) > 0
msg := sprintf(
"Resource '%s' is missing required tags: %v",
[resource.address, missing_tags]
)
}
# Enforce encryption at rest for all EBS volumes
deny[msg] {
resource := input.planned_values.root_module.resources[_]
resource.type == "aws_ebs_volume"
not resource.values.encrypted == true
msg := sprintf(
"EBS volume '%s' must have encryption enabled.",
[resource.address]
)
}
# Enforce MFA delete on S3 buckets with sensitive data
deny[msg] {
resource := input.planned_values.root_module.resources[_]
resource.type == "aws_s3_bucket"
tags := object.get(resource.values, "tags", {})
data_classification := object.get(tags, "DataClassification", "")
data_classification in ["Confidential", "Restricted"]
  not has_versioning_with_mfa(resource.values.bucket)
msg := sprintf(
"S3 bucket '%s' contains sensitive data and requires MFA delete protection.",
[resource.address]
)
}
has_versioning_with_mfa(bucket_name) if {
  resource := input.planned_values.root_module.resources[_]
  resource.type == "aws_s3_bucket_versioning"
  resource.values.bucket == bucket_name
  resource.values.mfa_delete == "Enabled"
}
Level 4: GitLab CI/CD Integration
For teams using GitLab:
# .gitlab-ci.yml
stages:
- validate
- security
- plan
variables:
TERRAFORM_VERSION: "1.6.0"
checkov_scan:
stage: security
image: bridgecrew/checkov:latest
script:
    - checkov -d terraform/ --framework terraform --output cli --output junitxml --output-file-path console,checkov-results.xml
  artifacts:
    reports:
      junit: checkov-results.xml
    paths:
      - checkov-results.xml
expire_in: 1 week
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
changes:
- terraform/**/*
- infrastructure/**/*
tfsec_scan:
stage: security
image: aquasec/tfsec:latest
script:
- tfsec terraform/ --format junit --out tfsec-results.xml
- tfsec terraform/ --format json --out tfsec-results.json
artifacts:
reports:
junit: tfsec-results.xml
paths:
- tfsec-results.json
expire_in: 1 week
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
terrascan_scan:
stage: security
image: tenable/terrascan:latest
script:
- terrascan scan -t terraform -d terraform/ -o sarif > terrascan-results.sarif
- terrascan scan -t terraform -d terraform/ -o json > terrascan-results.json
artifacts:
paths:
- terrascan-results.sarif
- terrascan-results.json
expire_in: 1 week
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
secrets_scan:
stage: security
image: trufflesecurity/trufflehog:latest
script:
- trufflehog git file://. --since-commit $CI_MERGE_REQUEST_DIFF_BASE_SHA --only-verified --fail
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
  allow_failure: false
Advanced Configuration and Tuning
Severity-Based Enforcement
Not all security issues are equal. Implement tiered enforcement:
# .checkov.yml
framework:
- terraform
soft-fail: false
# Fail only on HIGH and CRITICAL
hard-fail-on:
- HIGH
- CRITICAL
# Warn on MEDIUM and LOW
soft-fail-on:
- MEDIUM
- LOW
# Skip specific checks globally
skip-check:
- CKV_AWS_1 # S3 bucket logging (not required for dev environments)
# Custom policies directory
external-checks-dir:
- ./security-policies/custom-checks
# Output configuration
output:
- cli
- json
- sarif
compact: false
quiet: false
Environment-Specific Configuration
Apply different security standards based on environment:
name: Environment-Aware Security
on:
pull_request:
branches: [main, develop, staging]
jobs:
security-scan:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Determine environment
id: env
run: |
if [[ "${{ github.base_ref }}" == "main" ]]; then
echo "environment=production" >> $GITHUB_OUTPUT
echo "severity=CRITICAL,HIGH" >> $GITHUB_OUTPUT
elif [[ "${{ github.base_ref }}" == "staging" ]]; then
echo "environment=staging" >> $GITHUB_OUTPUT
echo "severity=HIGH" >> $GITHUB_OUTPUT
else
echo "environment=development" >> $GITHUB_OUTPUT
echo "severity=CRITICAL,HIGH,MEDIUM" >> $GITHUB_OUTPUT
fi
- name: Run Checkov
uses: bridgecrewio/checkov-action@v12
with:
directory: ./terraform/environments/${{ steps.env.outputs.environment }}
framework: terraform
check: ${{ steps.env.outputs.severity }}
          config_file: .checkov-${{ steps.env.outputs.environment }}.yml
Production Config (.checkov-production.yml):
framework: [terraform]
hard-fail-on: [CRITICAL, HIGH]
compact: false
# Stricter checks for production
skip-check: []
# Require all best practices
check:
- CKV_AWS_*
  - CKV2_AWS_*
Development Config (.checkov-development.yml):
framework: [terraform]
hard-fail-on: [CRITICAL]
soft-fail-on: [HIGH, MEDIUM]
# More lenient for dev environments
skip-check:
- CKV_AWS_7 # KMS encryption (cost optimization in dev)
- CKV_AWS_20 # S3 logging (not needed in dev)
  - CKV_AWS_21 # S3 versioning (not needed in dev)
Handling False Positives and Exceptions
Inline Suppressions with Justification
Always require documentation for suppressions:
# Example 1: Public S3 bucket for website hosting
resource "aws_s3_bucket" "website" {
# checkov:skip=CKV_AWS_18:Public access required for static website hosting
# checkov:skip=CKV_AWS_21:Versioning not required for ephemeral static assets
# Approved by: Security Team (JIRA-1234)
# Review Date: 2025-12-31
bucket = "my-public-website"
}
resource "aws_s3_bucket_public_access_block" "website" {
# checkov:skip=CKV_AWS_53:Intentionally allowing public access for website
bucket = aws_s3_bucket.website.id
block_public_acls = false
block_public_policy = false
ignore_public_acls = false
restrict_public_buckets = false
}
# Example 2: Security group for ALB (needs 0.0.0.0/0)
resource "aws_security_group" "alb" {
# checkov:skip=CKV_AWS_260:ALB needs to accept traffic from internet
# This is protected by WAF rules (see aws_wafv2_web_acl.main)
name = "alb-security-group"
description = "Security group for public-facing ALB"
vpc_id = aws_vpc.main.id
ingress {
description = "HTTPS from internet"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
# Example 3: Development database (relaxed security)
resource "aws_db_instance" "dev" {
# checkov:skip=CKV_AWS_16:Backup retention not required for dev database
# checkov:skip=CKV_AWS_129:Multi-AZ not cost-effective for dev environment
# Environment: Development only - Do not use this pattern in production
count = var.environment == "development" ? 1 : 0
identifier = "dev-database"
engine = "postgres"
instance_class = "db.t3.micro"
backup_retention_period = 0
multi_az = false
storage_encrypted = true # Still require encryption!
}
Centralized Exception Management
For larger teams, manage exceptions centrally. Note that Checkov's native baseline format only consumes the check_id, file, and resource fields; the justification and approval fields below are custom metadata for your own tracking tooling:
# .checkov.baseline
{
"check_type": "terraform",
"results": {
"failed_checks": [
{
"check_id": "CKV_AWS_18",
"file": "/terraform/s3.tf",
"resource": "aws_s3_bucket.website",
"justification": "Public website bucket - approved by Security Team",
"approved_by": "security@company.com",
"approval_date": "2025-01-15",
"review_date": "2025-07-15",
"jira_ticket": "SEC-1234"
},
{
"check_id": "CKV_AWS_260",
"file": "/terraform/alb.tf",
"resource": "aws_security_group.alb",
"justification": "ALB requires internet access - protected by WAF",
"approved_by": "security@company.com",
"approval_date": "2025-01-20",
"review_date": "2025-07-20",
"jira_ticket": "SEC-1235"
}
]
}
}
Run with baseline:
checkov -d . --baseline .checkov.baseline
Exception Governance Process
Implement a formal exception process:
1. Developer requests exception:
   - Creates JIRA ticket with justification
   - Adds inline suppression with ticket reference
   - Documents compensating controls
2. Security team reviews:
   - Validates business justification
   - Confirms compensating controls
   - Sets review/expiration date
3. Automated tracking:
   - Monthly report of all active exceptions
   - Alerts for expiring exceptions
   - Automatic revalidation triggers
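The automated-tracking step can be sketched in a few lines of Python against the extended baseline format above. Remember that the review_date field is custom metadata, not part of Checkov's native baseline, so this only works with your own extended file:

```python
"""Flag baseline exceptions whose review date has passed."""
import json
from datetime import date

# Hypothetical extended baseline content (normally read from .checkov.baseline)
baseline = json.loads("""
{"results": {"failed_checks": [
  {"check_id": "CKV_AWS_18", "resource": "aws_s3_bucket.website",
   "review_date": "2025-07-15"},
  {"check_id": "CKV_AWS_260", "resource": "aws_security_group.alb",
   "review_date": "2024-01-01"}
]}}
""")

today = date(2025, 3, 1)  # fixed date so the example is deterministic
stale = [
    e for e in baseline["results"]["failed_checks"]
    if date.fromisoformat(e["review_date"]) < today
]
for e in stale:
    print(f"STALE exception: {e['check_id']} on {e['resource']} "
          f"(review was due {e['review_date']})")
```

Run this in CI on a schedule and fail (or page the security channel) when stale entries appear.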
Exception Tracking Script:
#!/usr/bin/env python3
"""
Track and report on Checkov suppressions
"""
import re
from datetime import datetime
from pathlib import Path
def extract_suppressions(terraform_dir):
"""Extract all checkov:skip comments from Terraform files"""
suppressions = []
for tf_file in Path(terraform_dir).rglob("*.tf"):
with open(tf_file) as f:
for line_num, line in enumerate(f, 1):
if "checkov:skip" in line:
# Extract check ID and justification
match = re.search(r'checkov:skip=([^:]+):(.+)', line)
if match:
suppressions.append({
'file': str(tf_file),
'line': line_num,
'check_id': match.group(1),
'justification': match.group(2).strip(),
})
return suppressions
def find_expiring_exceptions(suppressions, days=30):
"""Find exceptions expiring within N days"""
expiring = []
for s in suppressions:
# Look for review dates in comments
review_match = re.search(r'Review Date:\s*(\d{4}-\d{2}-\d{2})',
s['justification'])
if review_match:
review_date = datetime.strptime(review_match.group(1), '%Y-%m-%d')
days_until = (review_date - datetime.now()).days
if days_until <= days:
expiring.append({**s, 'days_until_review': days_until})
return expiring
# Generate monthly report
suppressions = extract_suppressions('./terraform')
print(f"Total suppressions: {len(suppressions)}")
expiring = find_expiring_exceptions(suppressions, days=30)
if expiring:
    print(f"\n⚠️ {len(expiring)} exceptions expiring within 30 days:")
for e in expiring:
print(f" - {e['file']}:{e['line']} ({e['check_id']}) - "
              f"Review in {e['days_until_review']} days")
Common Security Issues and Remediation
Critical: S3 Bucket Security
Issue: Unencrypted S3 buckets
# ❌ Insecure
resource "aws_s3_bucket" "data" {
bucket = "my-data-bucket"
}
# ✅ Secure
resource "aws_s3_bucket" "data" {
bucket = "my-data-bucket"
}
resource "aws_s3_bucket_server_side_encryption_configuration" "data" {
bucket = aws_s3_bucket.data.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "aws:kms"
kms_master_key_id = aws_kms_key.data.arn
}
bucket_key_enabled = true
}
}
resource "aws_s3_bucket_public_access_block" "data" {
bucket = aws_s3_bucket.data.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
resource "aws_s3_bucket_versioning" "data" {
bucket = aws_s3_bucket.data.id
versioning_configuration {
status = "Enabled"
}
}
resource "aws_s3_bucket_logging" "data" {
bucket = aws_s3_bucket.data.id
target_bucket = aws_s3_bucket.logs.id
target_prefix = "s3-access-logs/"
}
Critical: Security Group Misconfigurations
Issue: Overly permissive security groups
# ❌ Insecure - SSH open to world
resource "aws_security_group" "app" {
name = "app-sg"
vpc_id = aws_vpc.main.id
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
# ✅ Secure - Use SSM Session Manager instead
resource "aws_security_group" "app" {
name = "app-sg"
description = "Application security group - no direct SSH access"
vpc_id = aws_vpc.main.id
# No SSH ingress rules - use AWS Systems Manager Session Manager
# Only application traffic from ALB
ingress {
description = "HTTP from ALB"
from_port = 8080
to_port = 8080
protocol = "tcp"
security_groups = [aws_security_group.alb.id]
}
egress {
description = "Allow all outbound"
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "app-security-group"
}
}
# IAM role for SSM access
resource "aws_iam_role" "ssm_role" {
name = "ec2-ssm-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "ec2.amazonaws.com"
}
}]
})
}
resource "aws_iam_role_policy_attachment" "ssm_policy" {
role = aws_iam_role.ssm_role.name
policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}
High: Database Security
Issue: Publicly accessible RDS instances
# ❌ Insecure
resource "aws_db_instance" "main" {
identifier = "production-db"
engine = "postgres"
publicly_accessible = true
storage_encrypted = false
backup_retention_period = 0
}
# ✅ Secure
resource "aws_db_instance" "main" {
identifier = "production-db"
engine = "postgres"
engine_version = "15.4"
instance_class = "db.t3.large"
# Security
publicly_accessible = false
storage_encrypted = true
kms_key_id = aws_kms_key.rds.arn
iam_database_authentication_enabled = true
# High availability
multi_az = true
backup_retention_period = 30
backup_window = "03:00-04:00"
maintenance_window = "mon:04:00-mon:05:00"
# Network
db_subnet_group_name = aws_db_subnet_group.private.name
vpc_security_group_ids = [aws_security_group.rds.id]
# Monitoring
enabled_cloudwatch_logs_exports = ["postgresql", "upgrade"]
monitoring_interval = 60
monitoring_role_arn = aws_iam_role.rds_monitoring.arn
# Encryption in transit
ca_cert_identifier = "rds-ca-rsa2048-g1"
# Deletion protection
deletion_protection = true
skip_final_snapshot = false
final_snapshot_identifier = "production-db-final-${formatdate("YYYY-MM-DD-hhmm", timestamp())}"
tags = {
Name = "production-database"
Environment = "production"
Compliance = "SOC2,HIPAA"
}
}
resource "aws_db_subnet_group" "private" {
name = "rds-private-subnet-group"
subnet_ids = aws_subnet.private[*].id
tags = {
Name = "RDS private subnet group"
}
}
resource "aws_security_group" "rds" {
name = "rds-security-group"
description = "RDS security group"
vpc_id = aws_vpc.main.id
ingress {
description = "PostgreSQL from application layer"
from_port = 5432
to_port = 5432
protocol = "tcp"
security_groups = [aws_security_group.app.id]
}
tags = {
Name = "rds-security-group"
}
}
High: IAM Overprivileging
Issue: Overly permissive IAM policies
# ❌ Insecure - wildcard permissions
resource "aws_iam_role_policy" "app" {
role = aws_iam_role.app.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Action = "*"
Resource = "*"
}]
})
}
# ✅ Secure - least privilege
resource "aws_iam_role_policy" "app" {
role = aws_iam_role.app.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "S3ReadAccess"
Effect = "Allow"
Action = [
"s3:GetObject",
"s3:ListBucket"
]
Resource = [
aws_s3_bucket.app_data.arn,
"${aws_s3_bucket.app_data.arn}/*"
]
},
{
Sid = "DynamoDBAccess"
Effect = "Allow"
Action = [
"dynamodb:GetItem",
"dynamodb:PutItem",
"dynamodb:UpdateItem",
"dynamodb:Query"
]
Resource = aws_dynamodb_table.app.arn
},
{
Sid = "KMSDecrypt"
Effect = "Allow"
Action = [
"kms:Decrypt",
"kms:DescribeKey"
]
Resource = aws_kms_key.app.arn
}
]
})
}
Medium: Missing Resource Tagging
Issue: Untagged resources
# ❌ No tags
resource "aws_instance" "web" {
ami = "ami-12345678"
instance_type = "t3.micro"
}
# ✅ Comprehensive tagging strategy
locals {
common_tags = {
Environment = var.environment
Application = "web-app"
Owner = "platform-team@company.com"
CostCenter = "engineering"
ManagedBy = "terraform"
Repository = "github.com/company/infrastructure"
DataClassification = "internal"
Compliance = "SOC2"
}
}
resource "aws_instance" "web" {
ami = "ami-12345678"
instance_type = "t3.micro"
tags = merge(
local.common_tags,
{
Name = "web-server-${var.environment}"
Role = "web"
}
)
}
# Default tags for all resources (AWS provider v3.38+)
provider "aws" {
region = "us-east-1"
default_tags {
tags = local.common_tags
}
}
Integration with Existing Workflows
Pre-Commit Hooks for Local Validation
Catch issues before they reach CI/CD:
# .pre-commit-config.yaml
repos:
- repo: https://github.com/antonbabenko/pre-commit-terraform
rev: v1.83.5
hooks:
- id: terraform_fmt
- id: terraform_validate
- id: terraform_docs
- id: terraform_tflint
- id: terraform_tfsec
args:
- --args=--minimum-severity=HIGH
- id: terraform_checkov
args:
- --args=--framework terraform --quiet
- repo: https://github.com/trufflesecurity/trufflehog
rev: v3.63.0
hooks:
- id: trufflehog
entry: trufflehog filesystem --directory . --only-verified --fail
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: check-added-large-files
      - id: check-merge-conflict
Installation:
pip install pre-commit
pre-commit install
pre-commit run --all-files
IDE Integration
Real-time security feedback during development:
VS Code Extensions:
- Checkov (Bridgecrew)
- Terraform (HashiCorp)
- tfsec
Configuration (.vscode/settings.json):
{
"checkov.token": "${env:CHECKOV_API_KEY}",
"checkov.framework": "terraform",
"checkov.severity": "high",
"terraform.languageServer.enable": true,
"terraform.validate.enable": true,
"files.associations": {
"*.tf": "terraform",
"*.tfvars": "terraform"
}
}
Pull Request Automation
Automatically comment on PRs with security findings:
name: PR Security Comments
on:
pull_request:
branches: [main]
jobs:
security-review:
runs-on: ubuntu-latest
permissions:
pull-requests: write
contents: read
steps:
- uses: actions/checkout@v4
- name: Run Checkov
id: checkov
uses: bridgecrewio/checkov-action@v12
with:
directory: ./terraform
framework: terraform
output_format: github_failed_only
soft_fail: true
- name: Comment PR
uses: actions/github-script@v7
if: always()
with:
script: |
const output = `#### Checkov Security Scan π‘οΈ
${{ steps.checkov.outputs.results }}
*Triggered by: @${{ github.actor }}*
*Action: ${{ github.event_name }}*`;
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: output
          });
Metrics and Reporting
Security Dashboard
Track security posture over time:
#!/usr/bin/env python3
"""
Security metrics collector for IaC scans
"""
import json
import boto3
from datetime import datetime
def publish_metrics(scan_results):
"""Publish security metrics to CloudWatch"""
cloudwatch = boto3.client('cloudwatch')
metrics = []
for severity in ['CRITICAL', 'HIGH', 'MEDIUM', 'LOW']:
count = len([r for r in scan_results['failed_checks']
if r['severity'] == severity])
metrics.append({
'MetricName': f'SecurityIssues{severity}',
'Value': count,
'Unit': 'Count',
'Timestamp': datetime.now(),
'Dimensions': [
{'Name': 'Environment', 'Value': 'production'},
{'Name': 'Scanner', 'Value': 'checkov'}
]
})
# Total issues
metrics.append({
'MetricName': 'SecurityIssuesTotal',
'Value': len(scan_results['failed_checks']),
'Unit': 'Count',
'Timestamp': datetime.now()
})
# Suppression rate
    total_checks = (len(scan_results['passed_checks'])
                    + len(scan_results['failed_checks'])
                    + len(scan_results['skipped_checks']))
    suppressed = len(scan_results['skipped_checks'])
    suppression_rate = (suppressed / total_checks * 100) if total_checks > 0 else 0
metrics.append({
'MetricName': 'SecuritySuppressionRate',
'Value': suppression_rate,
'Unit': 'Percent',
'Timestamp': datetime.now()
})
cloudwatch.put_metric_data(
Namespace='IaC/Security',
MetricData=metrics
)
# Usage
with open('checkov-results.json') as f:
results = json.load(f)
publish_metrics(results)
Compliance Reporting
Generate compliance reports for auditors:
#!/bin/bash
# generate-compliance-report.sh
REPORT_DIR="compliance-reports"
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
mkdir -p "$REPORT_DIR"
# Run compliance-focused scan
checkov -d terraform/ \
  --framework terraform \
  --check CKV2_AWS_* \
  --compact \
  --output cli \
  --output json \
  --output-file-path "console,$REPORT_DIR/compliance-$TIMESTAMP.json"
# Generate HTML report
python3 << EOF
import json
import sys
from jinja2 import Template
with open('$REPORT_DIR/compliance-$TIMESTAMP.json') as f:
data = json.load(f)
template = Template('''
<!DOCTYPE html>
<html>
<head><title>IaC Compliance Report</title></head>
<body>
<h1>Infrastructure Compliance Report</h1>
<p>Generated: {{ timestamp }}</p>
<h2>Summary</h2>
<table border="1">
<tr>
<td>Total Checks</td>
<td>{{ summary.passed + summary.failed }}</td>
</tr>
<tr>
<td>Passed</td>
<td style="color: green">{{ summary.passed }}</td>
</tr>
<tr>
<td>Failed</td>
<td style="color: red">{{ summary.failed }}</td>
</tr>
<tr>
<td>Compliance Rate</td>
<td>{{ compliance_rate }}%</td>
</tr>
</table>
<h2>Failed Checks</h2>
{% for check in failed_checks %}
<div>
<h3>{{ check.check_id }}</h3>
<p>{{ check.check_name }}</p>
<p>Resource: {{ check.resource }}</p>
<p>File: {{ check.file_path }}</p>
</div>
{% endfor %}
</body>
</html>
''')
html = template.render(
timestamp='$TIMESTAMP',
summary=data['summary'],
compliance_rate=round(data['summary']['passed'] /
(data['summary']['passed'] + data['summary']['failed']) * 100, 2),
failed_checks=data['results']['failed_checks']
)
with open('$REPORT_DIR/compliance-$TIMESTAMP.html', 'w') as f:
f.write(html)
EOF
echo "Compliance report generated: $REPORT_DIR/compliance-$TIMESTAMP.html"
Best Practices and Recommendations
1. Start Strict, Relax Selectively
Begin with all checks enabled and high severity enforcement:
- Captures baseline security posture
- Forces team to understand each suppression
- Creates security-conscious culture
2. Security as Code Review
Treat security suppressions like code changes:
- Require peer review for all suppressions
- Document justification and compensating controls
- Set expiration dates on exceptions
- Track exceptions in JIRA/Linear
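One lightweight way to route suppressions through peer review is a CODEOWNERS rule that assigns security-sensitive files to the security team. A sketch, where the paths and team names are placeholders for your own:

```text
# .github/CODEOWNERS (hypothetical paths and teams)
.checkov.yml            @company/security-team
.checkov.baseline       @company/security-team
security-policies/      @company/security-team
```

Inline checkov:skip comments in .tf files still land in ordinary PR review; a CI step that greps the diff for new skip comments and requests a security reviewer can cover that gap.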
3. Progressive Enhancement
Phase rollout to minimize disruption:
Phase 1: Visibility (Weeks 1-2)
- Run scans in report-only mode
- Share results in team meetings
- Build awareness of security issues
Phase 2: Soft Enforcement (Weeks 3-4)
- Fail on CRITICAL issues only
- Allow MEDIUM/LOW as warnings
- Create remediation runbook
Phase 3: Full Enforcement (Week 5+)
- Fail on CRITICAL and HIGH
- Require approvals for suppressions
- Monthly security reviews
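As a sketch, the three phases map onto successive .checkov.yml configurations. These fragments are illustrative alternatives for one file, not a single document, and severity-based hard-fail in Checkov may require platform API access for severity metadata:

```yaml
# Phase 1 - visibility: report everything, never fail the build
soft-fail: true

# Phase 2 - soft enforcement: fail only on CRITICAL
soft-fail: false
hard-fail-on:
  - CRITICAL

# Phase 3 - full enforcement: fail on CRITICAL and HIGH
soft-fail: false
hard-fail-on:
  - CRITICAL
  - HIGH
```

Rolling forward is then a one-line config change per phase, which keeps the rollout auditable in version control.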
4. Developer Education
Security scanning is most effective with educated developers:
- Weekly security tips in team channels
- "Lunch and learn" sessions on common issues
- Celebrate security improvements
- Share remediation patterns
5. Continuous Improvement
Regularly review and improve:
- Monthly review of all active suppressions
- Quarterly policy updates
- Annual security tool evaluation
- Track time-to-remediation metrics
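Time-to-remediation is simple to compute once you record when each finding was opened and closed. A minimal sketch with hypothetical sample data; in practice you would export these timestamps from your scanner's results or your ticket tracker:

```python
"""Compute mean time-to-remediation (MTTR) from finding open/close dates."""
from datetime import date

# Hypothetical findings export: id, opened date, closed date (None = still open)
findings = [
    {"id": "CKV_AWS_19", "opened": date(2025, 1, 2), "closed": date(2025, 1, 9)},
    {"id": "CKV_AWS_23", "opened": date(2025, 1, 5), "closed": date(2025, 1, 8)},
    {"id": "CKV_AWS_40", "opened": date(2025, 2, 1), "closed": None},
]

closed = [f for f in findings if f["closed"] is not None]
mttr_days = sum((f["closed"] - f["opened"]).days for f in closed) / len(closed)
print(f"Mean time-to-remediation: {mttr_days:.1f} days "
      f"over {len(closed)} closed findings")
```

Publishing this number monthly (for example, to the CloudWatch namespace used earlier) turns the "measure progress" advice into a trend line rather than a snapshot.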
Conclusion
Automating security checks in your IaC pipeline is not optionalβit's a fundamental requirement for modern cloud infrastructure. The investment is minimal compared to the risk of security breaches, compliance violations, or production incidents from misconfigured infrastructure.
Key Takeaways:
- Layer your defenses: Use multiple complementary tools
- Shift left: Catch issues early with pre-commit hooks and IDE integration
- Enforce selectively: Start with critical issues, expand coverage over time
- Document exceptions: Every suppression needs justification and review date
- Measure progress: Track metrics to demonstrate security improvement
- Educate teams: Security is everyone's responsibility
The tools and patterns in this guide provide a production-ready framework for IaC security. Start with basic scanning, layer in policy enforcement, and continuously improve your security posture.
Need help implementing a comprehensive IaC security pipeline for your organization? Contact us for expert guidance tailored to your tech stack and compliance requirements.