IPsec VPN

The standard for site-to-site encryption at L3. A full framework, not a single protocol — which is both its strength and its operational pain.

What IPsec actually is

IPsec is four things working together:

  1. AH (Authentication Header) — integrity + authenticity, no confidentiality. Rarely used on its own.
  2. ESP (Encapsulating Security Payload) — encryption + integrity. The workhorse. Protocol number 50.
  3. IKE (Internet Key Exchange) — how peers authenticate and negotiate keys. UDP 500 (+ UDP 4500 for NAT-T).
  4. SA (Security Association) — the resulting negotiated context (keys, algorithms, SPI). Unidirectional, so a tunnel actually has two.

You almost always use ESP + IKEv2. AH is historical; IKEv1 is deprecated (but still everywhere in the wild).

Modes

Mode      | What’s encrypted                             | Used for
Transport | Payload only; original IP header preserved   | Host-to-host; rare in enterprise
Tunnel    | Entire original packet, new outer IP header  | Site-to-site and remote access

Tunnel mode is the default mental model for IPsec VPN.

Packet structure (ESP tunnel mode)

[ New IP | ESP hdr | Original IP | TCP/UDP | Payload | ESP trailer | ESP auth ]
           ↑         └──────────────── encrypted ────────────────┘   ↑
           SPI, seq#                                                 ICV (HMAC or GCM tag)
           └─────────────────── authenticated ───────────────────┘
  • SPI (Security Parameter Index) — identifies which SA to use for decryption
  • ICV — integrity check value (HMAC-SHA256 etc.)
  • The new outer IP header is what the network sees; the inner packet is invisible in transit
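
To make the role of the SPI concrete, here is a minimal sketch of what a receiver does with the first 8 bytes of an ESP packet: read the SPI and sequence number, then look up the SA. The `SecurityAssociation` record, the `sad` dictionary, and all addresses and SPI values are hypothetical and exist only for illustration; a real SA also holds keys, lifetimes, and replay state.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityAssociation:
    """Hypothetical, stripped-down SA record (no keys or lifetimes)."""
    spi: int
    peer: str
    cipher: str


def parse_esp_header(esp_packet: bytes) -> tuple[int, int]:
    """Return (spi, sequence_number) from the first 8 bytes of an ESP packet."""
    if len(esp_packet) < 8:
        raise ValueError("ESP header needs at least 8 bytes")
    # Both fields are unsigned 32-bit integers in network byte order.
    return struct.unpack("!II", esp_packet[:8])


# Toy SA database keyed by SPI; the entry is made up for this example.
sad = {0x1000ABCD: SecurityAssociation(0x1000ABCD, "203.0.113.10", "aes256-gcm")}

# ESP header followed by 32 bytes of dummy ciphertext.
packet = bytes.fromhex("1000abcd00000007") + b"\x00" * 32
spi, seq = parse_esp_header(packet)
sa = sad.get(spi)  # None would mean "no SA for this SPI", so the packet is dropped
```

Because each SA is unidirectional, a real gateway would hold one such entry for inbound traffic and a separate one, with a different SPI, for outbound.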

IKE — the handshake

IKEv1 (legacy, still common)

Two phases:

  • Phase 1 — establish a secure channel between peers (IKE SA). Two modes:
    • Main Mode — 6 messages, identity-protected
    • Aggressive Mode — 3 messages, faster, identity exposed → avoid
  • Phase 2 — negotiate the IPsec SAs that actually protect data (Quick Mode, 3 messages)

IKEv2 (RFC 7296, preferred)

A single RFC with a cleaner design than IKEv1, which was spread across several documents. Benefits:

  • 4 messages total for initial setup (vs 9 for IKEv1 main + quick)
  • Built-in NAT traversal, DPD (dead peer detection), rekeying
  • MOBIKE extension — client can change IP without re-establishing (mobile users)
  • EAP support — plug into RADIUS / certs / smart cards for remote access

Default to IKEv2 unless a peer forces you to IKEv1.

Authentication methods (during IKE)

Method                   | Notes
Pre-shared key (PSK)     | Simplest. Weak if short. Leaked = full compromise. Fine for small deployments.
RSA / ECDSA certificates | Production-grade. Needs PKI. Revocation matters.
EAP (IKEv2 only)         | Remote access; lets you reuse RADIUS/AD credentials
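
Since a short PSK is the weak point of that method, one way to avoid it is to generate the key rather than invent it. A minimal sketch using Python's standard `secrets` module; the `generate_psk` helper and the 16-byte floor are assumptions for illustration, not a requirement of any standard.

```python
import secrets


def generate_psk(num_bytes: int = 32) -> str:
    """Return a random PSK built from num_bytes of OS-provided randomness."""
    if num_bytes < 16:
        # Assumed floor for this sketch: 128 bits of entropy.
        raise ValueError("use at least 16 random bytes")
    # URL-safe Base64 text avoids quoting problems in most config files.
    return secrets.token_urlsafe(num_bytes)


psk = generate_psk()
```

The resulting string still has to reach the peer over a secure channel; generating it well does not help if it is sent by plain email.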

Site-to-site vs remote access

Site-to-site

Two gateways establish a permanent tunnel. Interesting traffic (matching a crypto ACL or a route-based tunnel interface) is encrypted and sent across. Two camps:

  • Policy-based — crypto ACL defines what gets encrypted. Rigid but explicit.
  • Route-based — a virtual tunnel interface (VTI, GRE-over-IPsec, sVTI) is a normal L3 interface; you route traffic to it. Flexible; works with dynamic routing (OSPF/BGP over the tunnel). Almost always preferred today.
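
The difference between the two camps can be sketched as two lookup functions. In the policy-based case a packet is encrypted only if its source and destination both match a selector pair; in the route-based case the destination alone picks an interface by longest-prefix match. The prefixes, the interface names `tunnel0` and `eth0`, and both helpers are hypothetical.

```python
import ipaddress

# Policy-based: hypothetical crypto ACL of (local prefix, remote prefix) pairs.
CRYPTO_ACL = [
    (ipaddress.ip_network("10.1.0.0/16"), ipaddress.ip_network("10.2.0.0/16")),
]


def is_interesting(src: str, dst: str, acl=CRYPTO_ACL) -> bool:
    """True if the packet matches a selector pair and should be encrypted."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    return any(s in local and d in remote for local, remote in acl)


# Route-based: hypothetical routing table mapping prefixes to interfaces.
ROUTES = {
    ipaddress.ip_network("10.2.0.0/16"): "tunnel0",
    ipaddress.ip_network("0.0.0.0/0"): "eth0",
}


def egress_interface(dst: str, routes=ROUTES) -> str:
    """Pick the interface by longest-prefix match on the destination."""
    d = ipaddress.ip_address(dst)
    matches = [net for net in routes if d in net]
    if not matches:
        raise LookupError(f"no route to {dst}")
    return routes[max(matches, key=lambda net: net.prefixlen)]
```

With these tables, `is_interesting("10.1.5.5", "10.2.9.9")` is true while a packet sourced from `192.168.1.1` is not protected, even though `egress_interface("10.2.9.9")` would send either one into `tunnel0`. That is the practical difference: a routing protocol can update `ROUTES` at runtime, whereas the ACL has to be edited on both peers.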

Remote access

Client software (or OS built-in) establishes a tunnel to a gateway. IKEv2 + EAP-MSCHAPv2 or EAP-TLS are typical. Windows/macOS/iOS all have native IKEv2 support.

NAT Traversal (NAT-T, RFC 3947)

ESP (protocol 50) has no ports → can’t cross PAT. NAT-T wraps ESP in UDP 4500 (the encapsulation itself is specified in RFC 3948). Detection happens in IKE; if a NAT sits between the peers, IKE moves to UDP 4500 as well and all ESP is UDP-encapsulated.

Without NAT-T, IPsec over NAT fails silently — IKE completes on UDP 500 but ESP drops. Always enable NAT-T and allow UDP 500 + UDP 4500.
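
Once NAT-T is active, IKE and ESP share UDP 4500, so the receiver has to tell them apart. A minimal sketch of that demultiplexing, based on the conventions in RFC 3948: IKE messages are prefixed with four zero bytes (the non-ESP marker), a NAT keepalive is a single 0xFF byte, and anything else is treated as ESP, whose SPI is never zero. The function name and return labels are made up for this example.

```python
def classify_udp4500_payload(payload: bytes) -> str:
    """Classify the UDP payload of a datagram received on port 4500."""
    if payload == b"\xff":
        # One-byte keepalive that keeps the NAT mapping open.
        return "nat-keepalive"
    if len(payload) >= 4 and payload[:4] == b"\x00\x00\x00\x00":
        # Non-ESP marker: the IKE message starts after these four bytes.
        return "ike"
    if len(payload) >= 8:
        # Anything else starts with a non-zero SPI, i.e. an ESP header.
        return "esp"
    return "invalid"
```

For example, a payload starting with `00 00 00 00` is handed to the IKE daemon, while one starting with `10 00 ab cd` goes to ESP processing.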

Cipher choices (2025-ish sane defaults)

Component  | Recommended                                        | Avoid
Encryption | AES-256-GCM (or AES-128-GCM)                       | 3DES, DES, AES-CBC (no AEAD)
Integrity  | Integral to GCM; else HMAC-SHA256                  | MD5, SHA1
DH group   | 19 (ECP-256), 20 (ECP-384), 14 (MODP-2048 minimum) | Groups 1, 2, 5
PRF        | HMAC-SHA256                                        | MD5

GCM is an AEAD — it gives encryption and integrity in one pass, no separate HMAC. Preferred.
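
The "Avoid" column lends itself to an automated check. Below is a minimal sketch that flags weak entries in a proposal; the dictionary keys and algorithm spellings (`aes256-cbc`, `sha1`, and so on) are labels chosen for this example, not the syntax of any particular vendor or daemon.

```python
# Entries from the "Avoid" column, using made-up normalised spellings.
AVOID = {
    "encryption": {"des", "3des", "aes128-cbc", "aes256-cbc"},
    "integrity": {"md5", "sha1"},
    "dh_group": {1, 2, 5},
    "prf": {"md5"},
}


def audit_proposal(proposal: dict) -> list[str]:
    """Return one finding per proposal component that is on the avoid list."""
    findings = []
    for component, value in proposal.items():
        if value in AVOID.get(component, set()):
            findings.append(f"{component}: {value} is on the avoid list")
    return findings


good = {"encryption": "aes256-gcm", "dh_group": 19, "prf": "sha256"}
legacy = {"encryption": "3des", "integrity": "sha1", "dh_group": 2}
```

Here `audit_proposal(good)` returns an empty list and `audit_proposal(legacy)` returns three findings, one per weak component.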

IPsec vs WireGuard

Aspect                    | IPsec                             | WireGuard
Standardised              | IETF RFCs, universal              | No RFC; defined by its whitepaper and reference implementation (a de facto standard)
Config complexity         | High (IKE, SA, crypto maps, ACLs) | Very low: static peer keys, done
Crypto agility            | Negotiated per session            | Fixed modern suite (ChaCha20-Poly1305, Curve25519)
Mobility                  | MOBIKE                            | Built-in (roaming just works)
Enterprise vendor support | Universal                         | Emerging
Kernel support            | Everywhere                        | In-kernel on Linux since 5.6; kernel implementations also exist for OpenBSD, FreeBSD, and Windows (WireGuardNT); userspace elsewhere

For greenfield remote-access / mesh → WireGuard. For enterprise interop with legacy gear and cloud providers → IPsec.

Cloud hybrid patterns

  • AWS Site-to-Site VPN — two IPsec tunnels per connection (active/active), BGP over them. IKEv1 or IKEv2 (prefer v2). Up to ~1.25 Gbps per tunnel.
  • Azure VPN Gateway — similar; SKUs differ in throughput and tunnel count.
  • AWS Transit Gateway + VPN — attach multiple site-to-site VPNs to one TGW, dynamic routing to many VPCs.
  • For > ~1 Gbps steady → Direct Connect / ExpressRoute instead of IPsec.

Common gotchas

  • MTU / fragmentation — IPsec adds ~50–80 bytes per packet. If you don’t lower the inner MTU or enable TCP MSS clamping, full-size packets get fragmented or silently dropped: small exchanges (a ping, an SSH login) work, while bulk transfers stall.
  • Phase 2 selector mismatch — both peers must agree on the crypto ACL / traffic selectors exactly. Off-by-one prefix = no traffic.
  • Re-keying storms — short SA lifetimes + many tunnels = CPU burn. 8 hours for Phase 1, 1 hour for Phase 2 is common.
  • Asymmetric routing over a VTI — if return traffic takes a different path, stateful firewalls in the middle drop it.
  • “Interesting traffic” hasn’t arrived yet — policy-based tunnels sit idle until something matches the crypto ACL. A ping from the right source often brings them up.
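
The MTU gotcha in the list above comes down to arithmetic, sketched here under stated assumptions: an IPv4 outer header without options (20 bytes), an 8-byte ESP header, a 2-byte ESP trailer plus alignment padding, and defaults that model AES-GCM (8-byte IV, 16-byte ICV, 4-byte alignment). The function names and parameters are hypothetical; check the actual overhead for your cipher and platform.

```python
def max_inner_packet(path_mtu: int = 1500, nat_t: bool = False,
                     iv_len: int = 8, icv_len: int = 16,
                     align: int = 4, outer_ip: int = 20) -> int:
    """Largest inner IP packet that fits in one ESP tunnel-mode packet."""
    room = path_mtu - outer_ip - (8 if nat_t else 0)  # 8 = UDP header for NAT-T
    room -= 8 + iv_len + icv_len                      # ESP header, IV, ICV
    if room < align:
        raise ValueError("path MTU too small for ESP")
    # Inner packet + padding + 2-byte trailer must be a multiple of `align`.
    ciphertext = (room // align) * align
    return ciphertext - 2


def tcp_mss(inner_mtu: int, ip_hdr: int = 20, tcp_hdr: int = 20) -> int:
    """MSS to clamp to, assuming IPv4 and TCP headers without options."""
    return inner_mtu - ip_hdr - tcp_hdr


plain = max_inner_packet()                # AES-GCM, no NAT
natted = max_inner_packet(nat_t=True)     # AES-GCM behind NAT-T
cbc = max_inner_packet(iv_len=16, align=16)  # AES-CBC with a 16-byte ICV
```

On a 1500-byte path this gives 1446 bytes of inner packet for AES-GCM (54 bytes of overhead), 1438 with NAT-T, and 1438 for the CBC variant, so an MSS clamp of about 1398 covers the NAT-T case. Many deployments simply round down to a more conservative value.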

See also