# Infrastructure as Code: Repeatable, Versioned Infrastructure
A data center fails. You lose a month of backups. Recovery takes 3 weeks. Cost: $2M+ in downtime.
With Infrastructure as Code (IaC): Entire infrastructure defined in Git. Disaster happens. Run one command. Identical infrastructure rebuilt in 30 minutes.
IaC treats infrastructure like code: version-controlled, reviewable, repeatable.
IaC Tools
**Terraform:** Cloud-agnostic; most popular
```hcl resource "aws_instance" "web" { ami = "ami-0c55b159cbfafe1f0" instance_type = "t2.micro" tags = { Name = "web-server" } }
resource "aws_rds_instance" "db" { allocated_storage = 20 storage_type = "gp2" engine = "postgres" engine_version = "14.0" instance_class = "db.t3.micro" name = "mydb" username = "admin" password = var.db_password # From .tfvars skip_final_snapshot = true } ```
**CloudFormation:** AWS-native
**CDK:** Code-based (Python, TypeScript, etc.)
IaC Benefits
1. Disaster Recovery
Infrastructure lost? Redeploy in minutes.
```bash terraform apply -var-file=prod.tfvars
# 30 minutes later: identical infrastructure rebuilt ```
**Without IaC:** Manual rebuilding; weeks; prone to errors; missing configurations
2. Reproducibility
Dev, staging, production identical.
```hcl
# Single configuration resource "aws_instance" "app" { instance_type = var.instance_type # Different per environment
# Prod: t3.large (1.4M requests/day) # Staging: t3.micro (testing) }
terraform apply -var-file=prod.tfvars # Production terraform apply -var-file=staging.tfvars # Staging ```
3. Version Control
Every infrastructure change tracked in Git.
``` Commit: "Increase database storage from 100GB to 200GB" Author: Sarah Date: 2026-05-13 Why: "Traffic increased; approaching limit"
Previous commit: "Add second RDS replica" Author: John Date: 2026-05-12
Audit trail: Who changed what, when, why. ```
4. Code Review
Infrastructure changes reviewed before deployment.
``` PR: "Upgrade RDS engine to PostgreSQL 15" Comment: "Does this require maintenance window? Users affected?" Response: "3-minute maintenance window; ran tests; safe" Approved: merge to main CI/CD: Automatic deploy ```
5. Cost Visibility
See exactly what resources cost before provisioning.
```hcl resource "aws_instance" "web" { instance_type = "t3.xlarge" # $200/month
# Change to t3.large instance_type = "t3.large" # $100/month # Cost savings: $100/month } ```
IaC Best Practices
1. State Management
Terraform tracks state (current resources deployed).
**State file:** terraform.tfstate
Contains current resource IDs, configurations
Updated after each apply
Must be backed up; loss = confusion
**Best practice:**
```hcl terraform { backend "s3" { bucket = "my-state-bucket" key = "prod/terraform.tfstate" region = "us-east-1" encrypt = true } } ```
Store state in remote backend (S3, Terraform Cloud) with encryption and versioning.
2. Modularity
Don't define everything in one file. Use modules.
``` modules/ vpc/ main.tf variables.tf outputs.tf rds/ main.tf variables.tf outputs.tf ecs/ main.tf variables.tf outputs.tf
main.tf (composition): module "vpc" { ... } module "rds" { ... } module "ecs" { ... } ```
Each module is reusable; main.tf just composes them.
3. Secrets Management
Never hardcode secrets in code.
**Bad:**
```hcl password = "MySecretPassword123" ```
**Good:**
```hcl password = var.db_password
# Store secret in:
# - Terraform Cloud
# - AWS Secrets Manager
# - HashiCorp Vault
# Reference at runtime ```
4. Staging Before Production
Apply to staging first; verify. Then apply to production.
```bash terraform apply -var-file=staging.tfvars # Test
# Verify everything OK terraform apply -var-file=prod.tfvars # Production ```
5. Plan Before Apply
Review changes before making them.
```bash terraform plan -var-file=prod.tfvars
# Shows what will be created, updated, destroyed
# Example output:
# + aws_instance.web will be created
# ~ aws_rds_instance.db will be modified (allocated_storage: "100" -> "200")
# - aws_security_group.old will be destroyed
# OK? Apply terraform apply -var-file=prod.tfvars ```
Real-World IaC Scenarios
Scenario 1: Multi-Region Disaster Recovery
Primary region: us-east-1 (database, app, CDN) Disaster region: eu-west-1 (same, standby)
```hcl module "primary" { source = "./modules/aws-region" region = "us-east-1" ... }
module "disaster" { source = "./modules/aws-region" region = "eu-west-1" ... } ```
Primary region fails? Switch DNS to disaster region. Identical infrastructure means instant failover.
Scenario 2: Cost Optimization
Startup realizes 80% of traffic comes during 9 AM - 5 PM.
```hcl resource "aws_instance" "web" { count = var.is_business_hours ? 100 : 20 # Scale down at night; scale up during day
# Cost: $8K/month (fixed) → $5K/month (business hours) + $1K/month (off-hours) }
# Scheduled via Lambda trigger:
# 9 AM: terraform apply (count = 100)
# 6 PM: terraform apply (count = 20) ```
Scenario 3: Compliance & Governance
Regulatory requirement: All databases must be encrypted, backup to separate region, multi-AZ.
```hcl resource "aws_db_instance" "compliant" { storage_encrypted = true multi_az = true backup_retention_period = 30 backup_window = "03:00-04:00" copy_tags_to_snapshot = true
# terraform apply ensures compliance automatically } ```
No manual steps; policy enforced by code.
IaC Maturity
**Level 1:** Manual infrastructure; no code
**Level 2:** Infrastructure defined in IaC; deployed manually
**Level 3:** Infrastructure in IaC; deployed via CI/CD
**Level 4:** Infrastructure + policy enforced; cost management automated
**Level 5:** Full automation; self-healing, self-optimizing
Common IaC Mistakes
1. **State file not versioned** — Loss = unable to manage 2. **No staging environment** — Test production changes on prod (risky) 3. **Over-parameterization** — Too many variables; hard to understand 4. **Secrets in code** — Exposed in Git 5. **Manual changes after apply** — IaC and reality diverge 6. **No plan review** — Accidental deletion or major change
The Bottom Line
IaC transforms infrastructure from "manually configured systems" to "code-defined, repeatable systems."
Disaster? Redeploy in 30 minutes. Cost spike? Scale down instantly. Compliance? Enforce in code.
Define infrastructure in code. Version it. Review it. Deploy it automatically. Sleep well.
Senthil Kumar
Founder & CEO
Founder & CEO of Sentos Technologies. Passionate about AI-powered IT solutions and helping mid-market enterprises advance beyond.