IaC Fundamentals

Infrastructure as Code = treat infrastructure the same way you treat application code: text files in git, code review, CI/CD, tests, rollback via revert. Every cloud resource, every VM, every DNS record, every firewall rule — declared in a file, applied by a tool. The console is for reading; the code is the source of truth.

Why IaC wins

Before IaC, infra lived in people’s memory, wikis, and the console. Consequences:

  • No history. “Why is this rule open?” — nobody knows; it’s been like that for years.
  • No repro. Dev, staging, prod diverge silently. Bugs only happen in prod.
  • No rollback. A bad change = a long ticket.
  • Snowflakes. Every server is unique. Every server is fragile. Nobody wants to replace one.

IaC fixes all four:

  • History → git log on the infra repo shows every change, author, reason.
  • Repro → the same code builds dev and prod. Differences are explicit.
  • Rollback → git revert + terraform apply. Minutes, not days.
  • Cattle, not pets → any server is disposable; recreate it from code.

The two approaches — declarative vs imperative

The central split. See Declarative vs Imperative Automation for the deep version.

Declarative — “what”

You describe the desired state; the tool figures out how to get there.

# Terraform
resource "aws_instance" "web" {
  ami           = "ami-0abcdef"
  instance_type = "t3.small"
  tags = { Name = "web-01" }
}

Run it once → instance is created. Run it again → nothing happens (state matches). Change t3.small to t3.medium → the tool computes the diff and applies only that change. This idempotence (see Idempotence) is the property that makes declarative IaC safe.

Examples: Terraform / OpenTofu, Pulumi, CloudFormation, Bicep, Kubernetes YAML, Ansible (with well-written modules).

Imperative — “how”

You describe the steps; the tool runs them.

aws ec2 run-instances --image-id ami-0abcdef --instance-type t3.small

Run twice → two instances. The user is responsible for the “is it already done?” check. Brittle, but fine for one-off operational tasks (restart this service, drain this node) that aren’t trying to model steady-state.

Examples: plain shell scripts, aws CLI, kubectl create, one-off Ansible command: tasks.

Rule of thumb: declarative for desired-state infra; imperative only for operational actions where running twice has a meaningful second effect (e.g. “restart the database”).
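The “is it already done?” check that imperative tooling pushes onto the user can be sketched as a shell guard. This is illustrative only — `instance_exists` and `create_instance` use a marker file as a stand-in for real AWS CLI calls (`describe-instances` / `run-instances`):

```shell
# Sketch of an idempotency guard around an imperative create.
STATE_FILE="${TMPDIR:-/tmp}/web-01.created"
rm -f "$STATE_FILE"                              # clean slate for the demo

instance_exists() { [ -f "$STATE_FILE" ]; }      # stand-in: query the cloud API
create_instance() { touch "$STATE_FILE"; echo "created web-01"; }

ensure_instance() {
  if instance_exists; then
    echo "web-01 already exists, skipping"       # second run is a no-op
  else
    create_instance
  fi
}

ensure_instance   # first run: creates
ensure_instance   # second run: skips
```

Declarative tools do exactly this diff-then-act dance for you, against real API state instead of a marker file.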

The IaC tool landscape

Two big layers: provisioning (create cloud resources) and config management (set up what’s inside VMs). Modern stacks often use one tool per layer.

Provisioning tools

| Tool | Language | Scope | Notes |
|---|---|---|---|
| Terraform | HCL | Multi-cloud, multi-provider | The de-facto standard. HashiCorp changed license → community fork OpenTofu (drop-in compatible) |
| OpenTofu | HCL | Multi-cloud | MPL-licensed fork of Terraform; functionally equivalent |
| Pulumi | TypeScript / Python / Go / C# | Multi-cloud | “IaC in a real programming language” |
| AWS CloudFormation | YAML / JSON | AWS only | First-party, deep AWS integration |
| Azure Bicep | Bicep DSL | Azure only | Nicer front-end to ARM templates |
| Google Deployment Manager | YAML | GCP | Largely superseded by Terraform + Config Controller |
| Crossplane | Kubernetes CRDs | Multi-cloud | IaC through Kubernetes; GitOps-native |
| CDK (AWS) / CDKTF | TypeScript/Python → CFN/Terraform | Multi-cloud (via TF) | Generates templates from code |

If you’re starting today and cloud-agnostic: Terraform or OpenTofu. If single-cloud and all-in: cloud-native (CloudFormation / Bicep) is also fine.

Config management tools

| Tool | Model | Notes |
|---|---|---|
| Ansible | Agentless, SSH / WinRM; YAML playbooks | Easiest to start; see Ansible Fundamentals |
| Puppet | Agent + master; declarative DSL | Long history, enterprise, declining |
| Chef | Agent + master; Ruby DSL | Acquired by Progress; declining |
| Salt | Agent or agentless; YAML + Python | Event-driven, fast at scale |
| cloud-init | First-boot script on cloud VMs | Pair with Terraform for “bootstrap + hand off” |
| Packer | Builds golden images | Reduces config mgmt footprint at runtime |

Modern preference: bake images with Packer + minimal cloud-init, then run as immutable containers / VMs. Ansible fills the gaps.
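As a concrete instance of the “bootstrap + hand off” pattern, here is a minimal cloud-init user-data file you might pass from Terraform via `user_data`. A sketch only — the package choice is illustrative:

```yaml
#cloud-config
# First-boot bootstrap; runs once when the VM is created.
package_update: true
packages:
  - nginx
runcmd:
  - systemctl enable --now nginx
```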

Terraform in one screen

The one IaC tool you most need to know in 2026.

┌──────────────────────────────────────────────────────┐
│                                                      │
│    .tf files  ───► terraform plan ───► diff          │
│         │                                            │
│         │          terraform apply ───► cloud API    │
│         │                    │                       │
│         │                    ▼                       │
│         │             remote state                   │
│         │          (S3 + DynamoDB lock,              │
│         ▼           or Terraform Cloud)              │
│    git repo                                          │
│                                                      │
└──────────────────────────────────────────────────────┘

The core loop

terraform init          # download providers + configure backend
terraform fmt           # canonical formatting
terraform validate      # syntax/schema check
terraform plan          # show what will change
terraform apply         # do it (prompts for confirmation)
terraform destroy       # tear it down

Key concepts

  • Resource — a cloud object (aws_instance, azurerm_virtual_network). Declared in .tf.
  • Provider — plugin that talks to a platform (aws, azurerm, google, kubernetes, cloudflare, github, datadog, 500+ more).
  • Data source — read an existing resource without managing it (data "aws_vpc" "default" { default = true }).
  • Variable — input (variable "region" { default = "us-east-1" }).
  • Output — export a value to other modules or consumers.
  • Module — reusable bundle of resources, like a function. Can be local or pulled from the registry.
  • State — JSON file mapping declared resources to real-world IDs. Critical and sensitive.
  • Backend — where state lives (local, S3+DynamoDB, GCS, Azure Storage, Terraform Cloud). Prefer remote with locking.
  • Workspace — multiple state files from the same config (rarely the right abstraction for env separation — prefer separate configs).
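Several of these concepts tie together in a few lines. A hedged sketch — names and values are illustrative, not from the example above:

```hcl
variable "region" {
  type    = string
  default = "us-east-1"
}

# Data source: read the default VPC without managing it.
data "aws_vpc" "default" {
  default = true
}

# Resource: an object Terraform owns, referencing the data source.
resource "aws_security_group" "web" {
  name   = "web-sg"
  vpc_id = data.aws_vpc.default.id
}

# Output: export a value for other configs or humans.
output "sg_id" {
  value = aws_security_group.web.id
}
```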

A minimal module use

# main.tf
terraform {
  required_version = ">= 1.6"
  required_providers {
    aws = { source = "hashicorp/aws", version = "~> 5.0" }
  }
  backend "s3" {
    bucket         = "acme-tfstate-prod"
    key            = "networking/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "tfstate-locks"
    encrypt        = true
  }
}
 
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.5.0"
 
  name            = "prod"
  cidr            = "10.0.0.0/16"
  azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
  enable_nat_gateway = true
}

State is sacred

The state file maps declared resources to the real ones. It has:

  • Secrets (DB passwords, private keys — anything that ever passed through terraform apply sits in state in plaintext).
  • The single source of truth for “who owns this resource.”
  • Metadata the cloud API doesn’t persist.

Rules:

  • Store remotely — S3 + DynamoDB lock, GCS, Terraform Cloud. Never local for shared projects.
  • Encrypt at rest.
  • Restrict access — state bucket policy limits who can read it.
  • Lock it — concurrent apply on the same state corrupts it. DynamoDB lock / TFC lock handles this.
  • Never edit by hand — use terraform state rm, terraform import, terraform state mv.
  • Back it up. Versioned S3 = free state history.

If state is lost: every resource appears “new,” and re-applying would create duplicates. Painful to recover via terraform import.
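If it comes to that, recovery means re-adopting each resource by hand. Roughly, per resource (the address and instance ID below are examples):

```shell
# Re-adopt an existing resource into fresh state instead of recreating it.
terraform import aws_instance.web i-0123456789abcdef0

# Verify: plan should show no changes if config matches reality.
terraform plan
```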

The testing story

IaC testing is younger than app testing but maturing fast:

| Layer | Tools |
|---|---|
| Static checks | terraform validate, tflint, terraform-docs |
| Policy / compliance | OPA / Conftest, Sentinel, Checkov, tfsec, Trivy |
| Unit | Terratest (Go), terraform test (built-in HCL test framework) |
| Integration | Spin up a scratch AWS account, apply + assert + destroy |
| Drift detection | terraform plan in CI on a schedule; alert on non-zero diff |

Minimum bar: fmt, validate, tflint, and at least one policy tool (Checkov catches “open security group to 0.0.0.0/0” style mistakes) on every PR.
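The built-in terraform test framework (Terraform ≥ 1.6) reads *.tftest.hcl files. A minimal sketch — the resource address and condition are illustrative:

```hcl
# tests/network.tftest.hcl
run "vpc_cidr_is_expected" {
  command = plan

  assert {
    condition     = aws_vpc.main.cidr_block == "10.0.0.0/16"
    error_message = "VPC CIDR drifted from the expected value"
  }
}
```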

Multi-environment — the repo layout question

Three common patterns:

1. Directory per env

infra/
├── modules/
│   ├── vpc/
│   └── app/
├── envs/
│   ├── dev/    # calls modules with dev inputs, dev state
│   ├── staging/
│   └── prod/

Clear separation, different backends per env. Preferred.

2. Terraform workspaces

One config, terraform workspace select prod. State-switch via CLI. Fragile — one wrong switch and you apply dev changes to prod. Not recommended for env separation.

3. Terragrunt

A wrapper that keeps Terraform config DRY across envs. Popular at scale. Steeper learning curve.

IaC in a CI/CD pipeline

Typical pipeline for an infra change:

PR opened
  ↓
terraform fmt / validate / tflint / tfsec / checkov
  ↓
terraform plan  ──► post plan to PR as comment
  ↓
human review  ──► approve
  ↓
merge
  ↓
terraform apply  ──► on main, gated by env (dev → staging → prod)

Critical: never apply from a laptop. Apply runs in CI with OIDC-federated creds, not long-lived keys.
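A sketch of what OIDC-federated apply looks like in GitHub Actions — the role ARN is a placeholder; aws-actions/configure-aws-credentials exchanges the runner’s OIDC token for short-lived credentials:

```yaml
# .github/workflows/apply.yml (fragment)
permissions:
  id-token: write   # allow the job to request an OIDC token
  contents: read

jobs:
  apply:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-terraform  # placeholder
          aws-region: us-east-1
      - run: terraform init && terraform apply -auto-approve
```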

Anti-patterns

  1. Clicking in the console, then “importing.” Do it occasionally; don’t make it a habit. Drift builds up.
  2. Hard-coding secrets in .tf. Mark variables sensitive = true and pass values via TF_VAR_* env vars, or fetch from KMS/Vault at runtime.
  3. One giant state file for all infra. Blast radius = everything. Split by concern / team / lifetime (network, shared, per-app).
  4. Cross-state data reads without contracts. If app reads output from network state, pin the version or use explicit contracts. Otherwise changes in one break the other silently.
  5. prevent_destroy sprinkled everywhere, then -target hacks to work around it. Reserve prevent_destroy for genuinely critical resources (prod DB, prod VPC), applied intentionally.
  6. Writing a giant monolithic module. Modules are for reuse; if the code is used only once, keep it inline.
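Sketches for points 2 and 5 above — hedged; the resource names are examples:

```hcl
# 2. Sensitive input: value comes from the TF_VAR_db_password env var, never .tf.
variable "db_password" {
  type      = string
  sensitive = true
}

# 5. Deliberate destroy guard on a genuinely critical resource.
resource "aws_db_instance" "prod" {
  # ... engine, instance_class, etc.
  lifecycle {
    prevent_destroy = true
  }
}
```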

When NOT to use IaC

  • One-off experiments. Click in the console; delete when done. IaC has overhead.
  • Fully managed PaaS where there’s nothing to declare (e.g. some SaaS dashboards).
  • App-layer state that the application manages (DB rows, queue messages). IaC manages resources, not data.

See also