Skip to main content

Command Palette

Search for a command to run...

Industry News

Infrastructure as Code: Repeatable, Versioned Infrastructure

13 May 202614 min readSenthil Kumar

# Infrastructure as Code: Repeatable, Versioned Infrastructure

A data center fails. You lose a month of backups. Recovery takes 3 weeks. Cost: $2M+ in downtime.

With Infrastructure as Code (IaC): Entire infrastructure defined in Git. Disaster happens. Run one command. Identical infrastructure rebuilt in 30 minutes.

IaC treats infrastructure like code: version-controlled, reviewable, repeatable.

IaC Tools

**Terraform:** Cloud-agnostic; most popular

```hcl resource "aws_instance" "web" { ami = "ami-0c55b159cbfafe1f0" instance_type = "t2.micro" tags = { Name = "web-server" } }

resource "aws_rds_instance" "db" { allocated_storage = 20 storage_type = "gp2" engine = "postgres" engine_version = "14.0" instance_class = "db.t3.micro" name = "mydb" username = "admin" password = var.db_password # From .tfvars skip_final_snapshot = true } ```

**CloudFormation:** AWS-native

**CDK:** Code-based (Python, TypeScript, etc.)

IaC Benefits

1. Disaster Recovery

Infrastructure lost? Redeploy in minutes.

```bash terraform apply -var-file=prod.tfvars

# 30 minutes later: identical infrastructure rebuilt ```

**Without IaC:** Manual rebuilding; weeks; prone to errors; missing configurations

2. Reproducibility

Dev, staging, production identical.

```hcl

# Single configuration resource "aws_instance" "app" { instance_type = var.instance_type # Different per environment

# Prod: t3.large (1.4M requests/day) # Staging: t3.micro (testing) }

terraform apply -var-file=prod.tfvars # Production terraform apply -var-file=staging.tfvars # Staging ```

3. Version Control

Every infrastructure change tracked in Git.

``` Commit: "Increase database storage from 100GB to 200GB" Author: Sarah Date: 2026-05-13 Why: "Traffic increased; approaching limit"

Previous commit: "Add second RDS replica" Author: John Date: 2026-05-12

Audit trail: Who changed what, when, why. ```

4. Code Review

Infrastructure changes reviewed before deployment.

``` PR: "Upgrade RDS engine to PostgreSQL 15" Comment: "Does this require maintenance window? Users affected?" Response: "3-minute maintenance window; ran tests; safe" Approved: merge to main CI/CD: Automatic deploy ```

5. Cost Visibility

See exactly what resources cost before provisioning.

```hcl resource "aws_instance" "web" { instance_type = "t3.xlarge" # $200/month

# Change to t3.large instance_type = "t3.large" # $100/month # Cost savings: $100/month } ```

IaC Best Practices

1. State Management

Terraform tracks state (current resources deployed).

**State file:** terraform.tfstate

Contains current resource IDs, configurations

Updated after each apply

Must be backed up; loss = confusion

**Best practice:**

```hcl terraform { backend "s3" { bucket = "my-state-bucket" key = "prod/terraform.tfstate" region = "us-east-1" encrypt = true } } ```

Store state in remote backend (S3, Terraform Cloud) with encryption and versioning.

2. Modularity

Don't define everything in one file. Use modules.

``` modules/ vpc/ main.tf variables.tf outputs.tf rds/ main.tf variables.tf outputs.tf ecs/ main.tf variables.tf outputs.tf

main.tf (composition): module "vpc" { ... } module "rds" { ... } module "ecs" { ... } ```

Each module is reusable; main.tf just composes them.

3. Secrets Management

Never hardcode secrets in code.

**Bad:**

```hcl password = "MySecretPassword123" ```

**Good:**

```hcl password = var.db_password

# Store secret in:

# - Terraform Cloud

# - AWS Secrets Manager

# - HashiCorp Vault

# Reference at runtime ```

4. Staging Before Production

Apply to staging first; verify. Then apply to production.

```bash terraform apply -var-file=staging.tfvars # Test

# Verify everything OK terraform apply -var-file=prod.tfvars # Production ```

5. Plan Before Apply

Review changes before making them.

```bash terraform plan -var-file=prod.tfvars

# Shows what will be created, updated, destroyed

# Example output:

# + aws_instance.web will be created

# ~ aws_rds_instance.db will be modified (allocated_storage: "100" -> "200")

# - aws_security_group.old will be destroyed

# OK? Apply terraform apply -var-file=prod.tfvars ```

Real-World IaC Scenarios

Scenario 1: Multi-Region Disaster Recovery

Primary region: us-east-1 (database, app, CDN) Disaster region: eu-west-1 (same, standby)

```hcl module "primary" { source = "./modules/aws-region" region = "us-east-1" ... }

module "disaster" { source = "./modules/aws-region" region = "eu-west-1" ... } ```

Primary region fails? Switch DNS to disaster region. Identical infrastructure means instant failover.

Scenario 2: Cost Optimization

Startup realizes 80% of traffic comes during 9 AM - 5 PM.

```hcl resource "aws_instance" "web" { count = var.is_business_hours ? 100 : 20 # Scale down at night; scale up during day

# Cost: $8K/month (fixed) → $5K/month (business hours) + $1K/month (off-hours) }

# Scheduled via Lambda trigger:

# 9 AM: terraform apply (count = 100)

# 6 PM: terraform apply (count = 20) ```

Scenario 3: Compliance & Governance

Regulatory requirement: All databases must be encrypted, backup to separate region, multi-AZ.

```hcl resource "aws_db_instance" "compliant" { storage_encrypted = true multi_az = true backup_retention_period = 30 backup_window = "03:00-04:00" copy_tags_to_snapshot = true

# terraform apply ensures compliance automatically } ```

No manual steps; policy enforced by code.

IaC Maturity

**Level 1:** Manual infrastructure; no code

**Level 2:** Infrastructure defined in IaC; deployed manually

**Level 3:** Infrastructure in IaC; deployed via CI/CD

**Level 4:** Infrastructure + policy enforced; cost management automated

**Level 5:** Full automation; self-healing, self-optimizing

Common IaC Mistakes

1. **State file not versioned** — Loss = unable to manage 2. **No staging environment** — Test production changes on prod (risky) 3. **Over-parameterization** — Too many variables; hard to understand 4. **Secrets in code** — Exposed in Git 5. **Manual changes after apply** — IaC and reality diverge 6. **No plan review** — Accidental deletion or major change

The Bottom Line

IaC transforms infrastructure from "manually configured systems" to "code-defined, repeatable systems."

Disaster? Redeploy in 30 minutes. Cost spike? Scale down instantly. Compliance? Enforce in code.

Define infrastructure in code. Version it. Review it. Deploy it automatically. Sleep well.

Senthil Kumar

Founder & CEO

Founder & CEO of Sentos Technologies. Passionate about AI-powered IT solutions and helping mid-market enterprises advance beyond.

Share this article

Want more insights?

Subscribe to the Sentos newsletter for expert perspectives on managed IT, cybersecurity, AI, and digital transformation.

Advance Beyond.