AWS VPC Fundamentals

A VPC is your own private L3 network inside AWS. For a network engineer, it’s a familiar landscape with a few AWS-specific surprises. This note focuses on the surprises — the parts where AWS networking diverges from physical networking.

What a VPC is

A Virtual Private Cloud (VPC) is an isolated virtual network in one AWS region. It’s defined by:

  • A CIDR block (e.g. 10.0.0.0/16) — your address space
  • Subnets within the CIDR, each in exactly one Availability Zone
  • Route tables controlling L3 forwarding
  • Gateways for connectivity to the outside (IGW, NAT, VPN, TGW)
  • Security groups and NACLs for filtering

The closest traditional analogue is a VRF in a data center fabric — your own routing and address space, isolated from everyone else’s.

CIDR sizing

  • You pick the primary CIDR when creating the VPC (minimum /28, maximum /16)
  • Can add secondary CIDRs later (the default quota allows 4 additional IPv4 blocks; it can be raised)
  • AWS reserves 5 addresses per subnet: the first four (network address, VPC router, Amazon DNS, reserved for future use) and the last (broadcast — reserved even though broadcast doesn’t exist)
  • Don’t overlap with other VPCs you might peer with, or with on-prem networks you’ll connect via VPN/Direct Connect. This causes painful migrations later.

Common pattern: size the VPC /16, subnets /24. Gives you 256 subnets × 251 usable hosts.
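That arithmetic is easy to verify with Python's ipaddress module — a minimal sketch of carving a /16 into /24 subnets and subtracting the five AWS-reserved addresses:

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")

# Carve the VPC into /24 subnets
subnets = list(vpc.subnets(new_prefix=24))
print(len(subnets))  # 256 subnets

# AWS reserves 5 addresses per subnet: .0 (network), .1 (VPC router),
# .2 (DNS), .3 (future use), and the last address (broadcast)
usable = subnets[0].num_addresses - 5
print(usable)  # 251 usable hosts per /24
```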

Subnets — AZ-scoped L3 segments

A subnet is a piece of the VPC CIDR, bound to exactly one AZ. You can’t stretch a subnet across AZs. This is the fundamental HA unit — to be multi-AZ, your workload must live in subnets in multiple AZs.

Public vs private subnets

The distinction is purely about routing — there’s no “public subnet” checkbox. A subnet is “public” if its route table sends 0.0.0.0/0 to an Internet Gateway. Otherwise it’s “private.”

Public subnet route table:
  10.0.0.0/16  →  local
  0.0.0.0/0    →  igw-xxxx        ← makes it "public"

Private subnet route table:
  10.0.0.0/16  →  local
  0.0.0.0/0    →  nat-yyyy        ← outbound-only via NAT

Instances in a public subnet can have public IPs and be reached from the internet. Instances in a private subnet have no direct inbound path — they reach out via NAT.

Subnet surprises for a network engineer

  • No broadcast, no multicast (except via a special Transit Gateway multicast domain)
  • No L2 visibility — you don’t see your AZ-mates on the wire; packets are routed from the first hop
  • Implicit first-hop router — the .1 address of every subnet is always the VPC router. There’s no HSRP/VRRP; it’s just always up.
  • ARP is synthetic — AWS forges ARP responses; MAC spoofing is blocked at the hypervisor
  • You can’t run protocols that need L2 adjacency (OSPF on broadcast, VRRP, classic multicast) without workarounds (GRE/VXLAN overlays, Transit Gateway Connect)

Route tables — per-subnet policy

Each subnet is associated with exactly one route table. Multiple subnets can share a table. The table controls where the VPC router forwards traffic.

Possible route targets:

  Target                            Destination example
  local                             The VPC CIDR — automatic, can’t be removed
  Internet Gateway (igw-*)          0.0.0.0/0 for public subnets
  NAT Gateway (nat-*)               0.0.0.0/0 for private subnets (outbound only)
  VPC Peering (pcx-*)               Another VPC’s CIDR
  Transit Gateway (tgw-*)           Multiple destinations
  Virtual Private Gateway (vgw-*)   On-prem CIDRs via site-to-site VPN
  Network Interface (eni-*)         For traffic steering through an NVA (firewall)

Longest-prefix-match applies, like any routing table.
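The VPC router's route selection can be sketched as a toy longest-prefix-match lookup — the route targets below are made-up IDs for illustration:

```python
import ipaddress

# Illustrative route table for a private subnet (targets are made-up IDs)
routes = {
    "10.0.0.0/16": "local",        # intra-VPC traffic
    "192.168.0.0/16": "vgw-1234",  # on-prem via site-to-site VPN
    "0.0.0.0/0": "nat-5678",       # everything else out via NAT
}

def lookup(dst: str) -> str:
    """Longest-prefix match, like the VPC router."""
    addr = ipaddress.ip_address(dst)
    best = max(
        (ipaddress.ip_network(c) for c in routes if addr in ipaddress.ip_network(c)),
        key=lambda net: net.prefixlen,
    )
    return routes[str(best)]

print(lookup("10.0.3.7"))     # local — /16 beats /0
print(lookup("192.168.1.1"))  # vgw-1234
print(lookup("8.8.8.8"))      # nat-5678 — only the default route matches
```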

Internet Gateway (IGW)

  • A VPC-attached component providing 1:1 NAT for public IPs → private IPs
  • One per VPC, horizontally scaled, no maintenance
  • Does not cost anything on its own (unlike NAT Gateway)
  • Without an IGW attached + a route to it, no traffic leaves the VPC to the internet

An instance with a public IP doesn’t actually have a public IP on its NIC — the NIC has the private IP. The IGW translates inbound/outbound. This is why ifconfig on an EC2 instance shows only the private IP.

NAT Gateway

For private subnets that need outbound internet (apt updates, API calls) without being reachable from the internet.

  • Per-AZ managed service — deploy one NAT Gateway per AZ for true HA; sharing one across AZs adds cross-AZ data charges and makes its AZ a single point of failure
  • Charged per hour and per GB processed — expensive at scale; a common bill shocker
  • Alternatives: NAT Instance (self-managed EC2 with NAT enabled — cheaper but you babysit it), VPC Endpoints (see below) for AWS services
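To see why the per-GB charge dominates at scale, here's a back-of-envelope cost sketch. The rates are placeholder assumptions (roughly us-east-1 list prices at time of writing) — check the current pricing page before relying on them:

```python
# Rough NAT Gateway monthly cost model. Rates below are ASSUMED placeholders.
HOURLY_RATE = 0.045      # $ per NAT Gateway hour (assumed)
PER_GB_RATE = 0.045      # $ per GB processed (assumed)
HOURS_PER_MONTH = 730

def nat_monthly_cost(gateways: int, gb_processed: float) -> float:
    hourly = gateways * HOURS_PER_MONTH * HOURLY_RATE
    data = gb_processed * PER_GB_RATE
    return round(hourly + data, 2)

# Three AZs, 10 TB/month of outbound traffic through NAT:
print(nat_monthly_cost(3, 10_000))  # data processing alone is ~$450 of this
```

At this scale the per-GB component dwarfs the hourly component — which is why routing S3 traffic through a free Gateway Endpoint instead of NAT is such an easy win.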

Security Groups vs NACLs

The two filtering layers:

                    Security Group                                  NACL
  Scope             Per ENI (instance-level)                        Per subnet
  State             Stateful — return traffic implicitly allowed    Stateless — allow both directions explicitly
  Action            Allow only (implicit deny)                      Allow or deny, ordered rules
  Rule evaluation   All rules evaluated (permissive union)          First-match by rule number

Default rule of thumb: use Security Groups for almost everything. Treat NACLs as belt-and-suspenders for coarse subnet-level blocks (e.g. block a known-bad IP range at the subnet level).
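The stateful/stateless difference can be modeled in a few lines — this is a toy simulation (not an AWS API) showing why SG return traffic "just works" while a NACL needs an explicit rule for the reply direction:

```python
# Toy model: stateful SG vs stateless NACL.
class SecurityGroup:
    """Stateful: tracks outbound flows, implicitly allows their replies."""
    def __init__(self, inbound_allowed_ports):
        self.inbound = set(inbound_allowed_ports)
        self.tracked = set()  # connection-tracking table

    def outbound(self, dst_port):
        self.tracked.add(dst_port)
        return True  # default SG allows all outbound

    def inbound_ok(self, src_port):
        # Allowed if explicitly permitted OR a reply to a tracked flow
        return src_port in self.inbound or src_port in self.tracked

sg = SecurityGroup(inbound_allowed_ports={443})
sg.outbound(3306)           # instance opens a connection to a database
print(sg.inbound_ok(3306))  # True — reply implicitly allowed (stateful)

# NACL: stateless — each direction evaluated independently, no tracking.
nacl_inbound = {443}            # only 443 explicitly allowed inbound
print(3306 in nacl_inbound)     # False — the reply is dropped unless you
                                # also allow the ephemeral port range inbound
```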

See AWS Security Groups vs NACLs for the detailed comparison.

VPC connectivity options

How a VPC talks to other things:

  Need                                                  Use
  Connect two VPCs (same or different accounts)         VPC Peering (non-transitive, point-to-point)
  Connect many VPCs + on-prem                           Transit Gateway (hub-and-spoke, transitive)
  Site-to-site to on-prem over internet                 Site-to-Site VPN (IPsec over internet)
  Dedicated to on-prem                                  Direct Connect (private fiber circuit)
  Reach AWS services without traversing the internet    VPC Endpoints (Gateway for S3/DynamoDB, Interface for others via PrivateLink)
  Cross-region private                                  Transit Gateway peering or Cloud WAN

VPC Peering is not transitive. If A peers with B and B peers with C, A cannot reach C via B. Transit Gateway solves this at scale.
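The non-transitivity rule reduces to simple set logic — peering is a direct edge between two VPCs, while a Transit Gateway acts as a hub:

```python
# VPC peering as a set of direct edges — reachability is NOT transitive.
peerings = {frozenset({"A", "B"}), frozenset({"B", "C"})}

def can_reach(src, dst):
    """Reachable over peering only if a direct connection exists."""
    return frozenset({src, dst}) in peerings

print(can_reach("A", "B"))  # True  — direct peering
print(can_reach("A", "C"))  # False — no transit through B

# A Transit Gateway, by contrast, gives hub-and-spoke transitivity:
tgw_attachments = {"A", "B", "C"}

def can_reach_via_tgw(src, dst):
    return src in tgw_attachments and dst in tgw_attachments

print(can_reach_via_tgw("A", "C"))  # True — both attached to the hub
```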

The default VPC

Every AWS account starts with a default VPC in every region: 172.31.0.0/16, one public subnet per AZ, IGW attached, public IPs auto-assigned. Convenient for learning, dangerous for production because anything you launch is internet-reachable by default. For real workloads, create your own VPCs and ignore the default one.

Flow logs — the observability layer

VPC Flow Logs capture metadata (5-tuple + action) for every flow in/out of ENIs. Written to CloudWatch Logs or S3. Essential for:

  • Security analysis (what connected where)
  • Troubleshooting (“why can’t A reach B?”)
  • Cost attribution (cross-AZ traffic analysis)

Flow logs don’t capture packet payloads — that’s a VPC Traffic Mirroring feature, separate.

Common pitfalls

  1. Overlapping CIDRs — blocks VPC peering, TGW attachment, and on-prem integration. Plan before you build.
  2. NAT Gateway billing — processing charges on outbound data add up. S3 Gateway Endpoint bypasses NAT for S3 traffic (and it’s free). Use it.
  3. SG references across VPCs — SGs can reference other SGs only within the same VPC (or peered VPCs with config). Across TGW, you use CIDRs.
  4. Subnet sizing — too small (/27), you run out of IPs; too large, you waste space and can’t split later without migration.
  5. Public IP confusion — elastic IPs persist; auto-assigned public IPs change on stop/start. For anything needing a stable public endpoint, use an EIP or a load balancer.
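Pitfall 1 is cheap to catch before anything is built — a sketch using ipaddress.overlaps to validate a planned address space (the names and CIDRs below are illustrative):

```python
import ipaddress

# Planned address space (example values) — validate before creating anything
planned = {
    "vpc-prod": "10.0.0.0/16",
    "vpc-staging": "10.1.0.0/16",
    "on-prem": "10.0.128.0/17",  # oops — inside vpc-prod's range
}

def find_overlaps(cidrs):
    """Return every pair of names whose CIDR blocks overlap."""
    nets = {name: ipaddress.ip_network(c) for name, c in cidrs.items()}
    names = sorted(nets)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if nets[a].overlaps(nets[b])
    ]

print(find_overlaps(planned))  # [('on-prem', 'vpc-prod')]
```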

See also