quyennv.com

Senior DevOps Engineer · Healthcare, Singapore

Terraform: Architecture, How It Works, and Best Practices

#terraform #iac #infrastructure-as-code #devops #cloud

Terraform is an open-source Infrastructure as Code (IaC) tool by HashiCorp. You describe infrastructure in HCL (HashiCorp Configuration Language), and Terraform creates or updates resources in cloud providers (AWS, Azure, GCP) or other APIs (Kubernetes, DNS, etc.) so the real world matches your configuration.

Why Terraform?

  • Declarative: You define the desired state; Terraform figures out create/update/delete.
  • Multi-cloud and multi-service: One tool and one language for clouds, Kubernetes, databases, and more via providers.
  • State: Terraform keeps a state file of what it manages, which enables drift detection and safe updates.
  • Plan before apply: terraform plan shows the exact changes; you apply only after review.

Terraform architecture

Terraform has three main pieces: the core, providers, and state.

+--------------------------- TERRAFORM CORE (CLI) ---------------------------+
|  +-------------+  +-------------+  +-------------+  +-----------------+    |
|  |   Config    |  |  Graph      |  |  State      |  |  Provider       |    |
|  |   Loader    |->|  Builder    |->|  Manager    |<-|  Plugins        |    |
|  | (.tf files) |  | (dependency)|  | (read/write)|  | (AWS, Azure...) |    |
|  +-------------+  +-------------+  +------+------+  +--------+--------+    |
|         |                 |               |                   |            |
|         v                 v               v                   v            |
|  +---------------------------------------------------------------------+   |
|  |  Execution: plan (state+config -> diff) or apply (call providers)   |   |
|  +---------------------------------------------------------------------+   |
+----------------------------------------------------------------------------+
        |                        |                              |
        v                        v                              v
   Your .tf files          State (local file              Cloud / API
   (desired state)         or remote backend)            (actual resources)

Core components

  • Config loader: Reads .tf and .tf.json files, resolves variables and modules, and builds an internal representation of the desired infrastructure.
  • Graph builder: Builds a dependency graph of resources (e.g. subnet before VM). Plan and apply execute in an order that respects dependencies.
  • State manager: Reads and writes state: a mapping of your config (resource addresses) to real resource IDs and attributes. Used for drift detection and update planning.
  • Execution engine: For plan, compares desired (config) vs current (state) and asks providers to refresh; produces a change set. For apply, invokes provider APIs to create/update/delete resources, then updates state.

Providers

  • Providers are plugins that talk to a specific API (e.g. AWS, Azure, Kubernetes). Each resource type (e.g. aws_instance, azurerm_resource_group) belongs to a provider.
  • Terraform downloads providers at terraform init and uses them during plan and apply.
  • You declare which providers and versions you need; the core stays small and generic; all API logic lives in providers.
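When you need the same provider with different settings (e.g. two AWS regions), provider aliases let several configurations coexist. A minimal sketch — the regions and bucket name here are illustrative:

```hcl
# Default AWS provider plus an aliased one for a second region
provider "aws" {
  region = "ap-southeast-1"
}

provider "aws" {
  alias  = "use1"
  region = "us-east-1"
}

# A resource opts into the aliased provider explicitly
resource "aws_s3_bucket" "replica" {
  provider = aws.use1
  bucket   = "example-app-replica"
}
```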

State

  • State holds: resource address → provider’s resource ID and stored attributes. Terraform uses it to know what it already created and what to change or destroy.
  • The backend determines where state lives: local (a file on disk) or remote (e.g. S3 + DynamoDB, Azure Storage, Terraform Cloud). Remote backends enable team collaboration, locking, and better security (no state on laptops).
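One way stacks consume each other's state is the terraform_remote_state data source. A sketch, assuming a networking stack already writes its state to the S3 key shown and exposes a vpc_id output (bucket and key names are hypothetical):

```hcl
# Read outputs from another stack's remote state
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "my-company-terraform-state"
    key    = "prod/network/terraform.tfstate"
    region = "us-east-1"
  }
}

# Use an output exposed by the networking stack
locals {
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```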

How Terraform works

1. Write configuration (.tf)

You define resources, data sources, variables, and outputs in HCL:

# main.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.aws_region
}

resource "aws_s3_bucket" "app" {
  bucket = "${var.project_name}-app-${var.environment}"

  tags = {
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

2. terraform init

  • Downloads the provider plugins (and module sources if you use modules).
  • Initializes the backend (e.g. configures remote state).
  • Run once per working directory (and after adding providers or changing backend).
terraform init

3. terraform plan

  • Refreshes state: Asks providers for the current real-world attributes of managed resources (on by default; skip with -refresh=false).
  • Plans: Compares config (desired) with state (current) and computes create, update, or destroy actions.
  • Output: Human-readable plan; no changes are made. Use this to review before apply.
terraform plan -out=tfplan

4. terraform apply

  • Runs the planned changes: calls provider APIs to create/update/delete resources.
  • Updates state after each successful change.
  • With -auto-approve it skips the confirmation prompt (useful in CI); otherwise Terraform asks for confirmation.
terraform apply tfplan
# or
terraform apply

5. terraform destroy

  • Plans and applies the destruction of all resources in state (in reverse dependency order). Use with care; often you want to destroy only a subset (-target, or a separate state).
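To guard critical resources against an accidental destroy, a lifecycle block can refuse the operation outright. A sketch (the resource and bucket name are illustrative):

```hcl
# prevent_destroy makes Terraform error out instead of destroying this resource
resource "aws_s3_bucket" "critical_data" {
  bucket = "example-critical-data"

  lifecycle {
    prevent_destroy = true
  }
}
```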

End-to-end flow

  1. You edit .tf (desired state).
  2. terraform init ensures providers and backend are ready.
  3. terraform plan refreshes state, diffs desired vs current, outputs the change set.
  4. terraform apply executes the change set via providers and writes new state.
  5. State is the source of truth for “what Terraform manages”; real infrastructure is the source of truth for “what actually exists.” Terraform reconciles the two.

Best practices

1. State management

  • Use a remote backend in production (e.g. S3 + DynamoDB for locking, Azure Storage, GCS, or Terraform Cloud). Avoid long-lived local state.
  • Enable state locking so two runs never apply at the same time (DynamoDB for S3 backend, etc.).
  • Never commit state with secrets or sensitive data to Git. Prefer remote state and restrict access with IAM.
  • Separate state per environment or layer (e.g. one state for networking, one for compute) to limit blast radius and allow different lifecycles.
# backend.tf (example: S3 + DynamoDB for AWS)
terraform {
  backend "s3" {
    bucket         = "my-company-terraform-state"
    key            = "prod/app/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}

2. Use modules

  • Modules are reusable bundles of resources (e.g. “VPC module,” “ECS service module”). Use them to avoid copy-paste and to standardize patterns.
  • Compose environments from the same module with different variables (e.g. dev vs prod).
  • Prefer HashiCorp and community modules where they fit (e.g. terraform-aws-modules/vpc/aws) and wrap or extend them if needed.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.0.0"

  name = "${var.project_name}-vpc"
  cidr = var.vpc_cidr

  azs             = var.azs
  private_subnets = var.private_subnet_cidrs
  public_subnets  = var.public_subnet_cidrs

  enable_nat_gateway = true
  single_nat_gateway = var.environment != "prod"
}
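Wrapping a registry module in a local module is one way to bake in company defaults. A sketch, assuming a modules/network directory in your repo (variable declarations omitted for brevity):

```hcl
# modules/network/main.tf — thin wrapper that standardizes defaults
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.0.0"

  name            = var.name
  cidr            = var.cidr
  azs             = var.azs
  private_subnets = var.private_subnets
  public_subnets  = var.public_subnets

  # Company default: always NAT, never optional
  enable_nat_gateway = true
}

# Root config calls the wrapper like any other module:
# module "network" {
#   source = "./modules/network"
#   name   = "${var.project_name}-vpc"
# }
```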

3. Pin provider and Terraform versions

  • In the terraform block, pin required_providers (and versions) so everyone and CI use the same provider behavior.
  • Pin the Terraform version as well when multiple people or CI run the same config (e.g. required_version = ">= 1.5.0").
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

4. Variables and outputs

  • Use variables for environment-specific or sensitive values (region, env name, instance type). Give defaults where it makes sense; use variable validation to fail fast on bad input.
  • Use outputs to expose IDs, ARNs, and endpoints to other Terraform configs (via remote state) or to CI/CD.
  • Sensitive variables: Mark as sensitive = true so they are redacted in plan/apply logs. Note that sensitive values are still stored in plaintext in state, so protect the state file; store values in env vars or a secret manager, not in .tf files.
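A minimal sketch of a variable with a default plus outputs that expose identifiers (the bucket reference assumes the aws_s3_bucket.app resource from the earlier example):

```hcl
# variables.tf — default where it makes sense
variable "aws_region" {
  type        = string
  description = "AWS region to deploy into"
  default     = "ap-southeast-1"
}

# outputs.tf — expose identifiers for other stacks or CI/CD
output "app_bucket_name" {
  value       = aws_s3_bucket.app.bucket
  description = "Name of the application S3 bucket"
}

output "app_bucket_arn" {
  value       = aws_s3_bucket.app.arn
  description = "ARN of the application S3 bucket"
}
```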

5. Naming and structure

  • Resource names: Use a consistent scheme (e.g. {project}-{env}-{resource}) and tags (Environment, Project, ManagedBy) so you can identify and govern resources.
  • File layout: Split by concern (e.g. main.tf, variables.tf, outputs.tf, versions.tf, backend.tf) or by layer (e.g. network.tf, compute.tf). Keep modules in a modules/ directory.
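With the AWS provider, default_tags applies the tagging scheme to every taggable resource, so individual resources cannot forget it. A sketch using the variables from the earlier examples:

```hcl
provider "aws" {
  region = var.aws_region

  # Applied to all taggable resources managed by this provider
  default_tags {
    tags = {
      Project     = var.project_name
      Environment = var.environment
      ManagedBy   = "terraform"
    }
  }
}
```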

6. Security (overview)

  • No secrets in code: Use TF_VAR_* or a secret store; mark variables as sensitive.
  • Least privilege: Run Terraform (and CI) with IAM/roles that have only the permissions needed for the resources you manage.
  • Private state: Store state in a bucket/container with encryption and access controls; use a private backend or VPC endpoints where possible.
  • Review plans: In CI, run terraform plan and require approval before apply; consider policy as code (e.g. Sentinel, OPA) for guardrails.

See Advanced: Security and hardening and Secure Terraform code examples below for details and ready-to-use patterns.

7. Workspaces vs separate state

  • Workspaces (e.g. default, dev, prod) use one backend with different state keys. They are simple but can be confusing; naming and discipline matter.
  • Separate directories or repos per environment with their own state (and backend config) give clear separation and are often easier to reason about for prod vs non-prod.
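If you do use workspaces, terraform.workspace is available inside config to key names and sizing off the current workspace. A sketch (resource and variable names are illustrative):

```hcl
# Workspace commands (run outside the config):
#   terraform workspace new dev
#   terraform workspace select prod
#   terraform workspace show

resource "aws_s3_bucket" "app" {
  # e.g. myproject-dev-app vs myproject-prod-app
  bucket = "${var.project_name}-${terraform.workspace}-app"
}
```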

8. Plan and apply in CI

  • Run terraform fmt -check and terraform validate in CI.
  • Run terraform plan on every change and store the plan artifact; apply only after review (or from a protected branch). Use -target or -destroy sparingly and with explicit approval.

Advanced: Security and hardening

Secrets management

  • Never commit secrets: No passwords, API keys, or tokens in .tf, .tfvars, or state. Use environment variables (TF_VAR_*), a secret manager (HashiCorp Vault, AWS Secrets Manager), or CI secrets.
  • Mark sensitive variables: Set sensitive = true on variables and outputs so Terraform redacts them in logs and in terraform plan output.
  • Sensitive in state: State can contain sensitive values (e.g. a DB password in a resource attribute). Always use a remote backend with encryption and strict access control; never commit state.
  • Provider credentials: Prefer IAM roles (e.g. EC2 instance profile, OIDC in CI) over long-lived access keys. If you use keys, inject them via env vars, not files in the repo.

State security

  • Remote backend only in production: Use S3, Azure Storage, GCS, or Terraform Cloud with encryption at rest.
  • State locking: Prevent concurrent apply (e.g. DynamoDB for S3 backend). Reduces risk of state corruption and conflicting changes.
  • Access control: Restrict who can read/write state (IAM, RBAC). Prefer separate state per environment and least-privilege roles for CI.
  • Encryption: Enable server-side encryption on the state bucket/container; use KMS where available for audit and key control.
  • Private access: Use VPC endpoints or private connectivity to the state backend so state does not traverse the public internet.

Least privilege for Terraform execution

  • Scoped IAM: The identity that runs terraform apply (user or CI role) should have only the permissions required to create/update/delete the resources in your config. Avoid broad policies like *:* on a whole service.
  • Policy documents in Terraform: Define IAM policies and roles in Terraform and attach them to the runner; use conditions (e.g. aws:RequestedRegion, aws:SourceAccount) to tighten scope.
  • Separate roles per stack: Use different roles for dev vs prod so a compromise or mistake in dev does not affect production.
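A sketch of a region-scoped policy for a Terraform runner, using jsonencode and an aws:RequestedRegion condition (the actions, region, and policy name here are illustrative; scope them to what your config actually manages):

```hcl
resource "aws_iam_policy" "terraform_runner" {
  name = "terraform-runner-app-prod"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "ec2:Describe*",
        "ec2:RunInstances",
        "ec2:TerminateInstances",
        "ec2:CreateTags",
      ]
      Resource = "*"
      # Refuse use outside the one region this stack manages
      Condition = {
        StringEquals = {
          "aws:RequestedRegion" = "ap-southeast-1"
        }
      }
    }]
  })
}
```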

Supply chain and integrity

  • Provider version pinning: Pin required_providers with a version constraint (e.g. ~> 5.0) and run terraform init -upgrade in a controlled way. Use .terraform.lock.hcl and commit it so everyone uses the same provider binaries.
  • Module sources: Prefer modules from trusted registries (HashiCorp, official); pin version or commit. For private modules, use a private registry or tagged Git refs.
  • Verification: Terraform can verify provider checksums (in the lock file). In locked-down environments, use a private provider mirror and verify hashes.

Policy as code and scanning

  • Pre-apply checks: Use tools like tfsec, Checkov, or Trivy to scan .tf for misconfigurations (e.g. public S3 buckets, open security groups, unencrypted storage).
  • Terraform Cloud / Enterprise: Sentinel policies can enforce rules (e.g. “no instance without a tag”, “only approved instance types”) before an apply.
  • OPA (Open Policy Agent): Integrate with CI to evaluate Terraform plan or state against custom policies (e.g. “no new public IPs in prod”).

Secure resource defaults

  • Encryption: Enable encryption at rest (S3, EBS, RDS, etc.) and in transit (TLS) by default in your modules and examples.
  • Networking: Prefer private subnets and security groups that allow only necessary ingress/egress; avoid 0.0.0.0/0 unless required and documented.
  • Logging and auditing: Enable CloudTrail, flow logs, and resource-level logging where relevant so you can audit changes and investigate incidents.
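As a sketch, VPC flow logs to S3 can be enabled alongside the networking module (the VPC reference assumes the earlier module example; the logs bucket is hypothetical and defined elsewhere):

```hcl
# Capture accepted and rejected traffic for audit and incident response
resource "aws_flow_log" "vpc" {
  vpc_id               = module.vpc.vpc_id
  traffic_type         = "ALL"
  log_destination_type = "s3"
  log_destination      = aws_s3_bucket.flow_logs.arn  # assumes a dedicated logs bucket
}
```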

Secure Terraform code examples

The following snippets show patterns to enhance security in your Terraform source.

1. Sensitive variables and validation

# variables.tf
variable "db_password" {
  description = "Database master password"
  type        = string
  sensitive   = true

  validation {
    condition     = length(var.db_password) >= 16
    error_message = "Password must be at least 16 characters."
  }
}

variable "environment" {
  type        = string
  description = "Environment name (dev, staging, prod)"

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be dev, staging, or prod."
  }
}
  • Use sensitive = true so the value is never printed in plan/apply or in logs.
  • Use validation blocks to fail fast on invalid or dangerous values (e.g. prod safeguards).

2. Remote backend with encryption and locking

# backend.tf
terraform {
  backend "s3" {
    bucket         = "my-org-terraform-state"
    key            = "prod/network/terraform.tfstate"
    region         = "ap-southeast-1"
    dynamodb_table = "terraform-state-lock"
    encrypt        = true
    kms_key_id     = "arn:aws:kms:ap-southeast-1:123456789012:key/..."
  }
}
  • encrypt = true enables server-side encryption; kms_key_id uses a customer-managed key for audit and control.
  • dynamodb_table enables state locking so concurrent applies are blocked.

3. No hardcoded credentials; use env or data source

# Bad: never do this
# provider "aws" {
#   access_key = "AKIA..."
#   secret_key = "..."
# }

# Good: credentials from environment (or IAM role if on EC2/ECS/Lambda)
provider "aws" {
  region = var.aws_region
  # access_key and secret_key omitted: use AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, or IAM role
}

# Good: fetch existing secret from a secret manager instead of passing raw values
data "aws_secretsmanager_secret_version" "db" {
  secret_id = var.db_secret_arn
}

locals {
  db_credentials = jsondecode(data.aws_secretsmanager_secret_version.db.secret_string)
}
  • Provider credentials from env or instance/profile; never in .tf or .tfvars committed to Git.
  • Use data sources (e.g. Secrets Manager, SSM Parameter Store) to pull secrets at apply time instead of TF_VAR_* when the secret already lives in a secure store.

4. Secure S3 bucket (encryption, versioning, block public access)

resource "aws_s3_bucket" "app_data" {
  bucket = "${var.project_name}-${var.environment}-data"

  tags = {
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

resource "aws_s3_bucket_versioning" "app_data" {
  bucket = aws_s3_bucket.app_data.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_kms_key" "s3" {
  description             = "KMS key for ${var.project_name} S3 bucket encryption"
  deletion_window_in_days = 10
  enable_key_rotation     = true
}

resource "aws_s3_bucket_server_side_encryption_configuration" "app_data" {
  bucket = aws_s3_bucket.app_data.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.s3.arn
    }
    bucket_key_enabled = true
  }
}

resource "aws_s3_bucket_public_access_block" "app_data" {
  bucket = aws_s3_bucket.app_data.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
  • Versioning for recovery; KMS encryption for at-rest security; public_access_block to prevent accidental public exposure.

5. Restrictive security group (least privilege)

resource "aws_security_group" "app" {
  name_prefix = "${var.project_name}-app-"
  vpc_id      = module.vpc.vpc_id
  description = "Application tier; allow only from ALB and required egress"

  ingress {
    description     = "HTTPS from ALB"
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    description = "HTTPS to internet (e.g. APIs)"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  lifecycle {
    create_before_destroy = true
  }

  tags = {
    Name = "${var.project_name}-app"
  }
}
  • Ingress only from the ALB security group (no 0.0.0.0/0); egress limited to what the app needs (here HTTPS). Tighten egress further (e.g. VPC endpoints, specific prefixes) where possible.

6. Outputs that must stay secret

output "db_endpoint" {
  value       = aws_db_instance.main.endpoint
  description = "Database endpoint"
}

output "db_password" {
  value       = aws_db_instance.main.password
  sensitive   = true
  description = "Database password (redacted in logs)"
}
  • Mark any output that could contain secrets as sensitive = true so it is never printed in logs or in the plan.

7. Version and lock file (reproducible, verifiable runs)

# versions.tf
terraform {
  required_version = ">= 1.5.0, < 2.0.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
  • Pin required_version and required_providers; run terraform init and commit .terraform.lock.hcl so all environments and CI use the same provider versions and checksums.

Summary

  • Architecture: Core (config, graph, state, execution) + providers (plugins) + state (mapping from config to real IDs).
  • Flow: init → plan (diff desired vs state) → apply (call providers, update state). State is the link between config and real resources.
  • Best practices: Remote state with locking; modules; version pinning; variables/outputs and sensitive handling; consistent naming and tags; no secrets in code; review plans and least-privilege IAM.
  • Security: No secrets in code; sensitive variables and outputs; encrypted remote state and locking; least-privilege IAM for Terraform; supply chain (pinned providers, lock file); policy/scanning (tfsec, Sentinel, OPA); secure defaults (encryption, restrictive security groups).
  • Secure code: Use validation and sensitive on variables; backend with encryption and DynamoDB lock; credentials from env or IAM; data sources for secrets; S3 encryption and public access block; restrictive security groups; sensitive outputs; pin versions and commit the lock file.

Terraform lets you manage cloud and other APIs declaratively. Combining its architecture and workflow with security best practices and secure Terraform code patterns (secrets handling, state protection, least privilege, and scanning) keeps infrastructure safe and maintainable as you scale.
