AWS EC2 Fundamentals

EC2 (Elastic Compute Cloud) is virtual machines as a service. It was one of AWS’s earliest services (launched in 2006) and is still the foundation of most workloads. Understand EC2 and you understand how AWS exposes IaaS in general.

What EC2 is

An EC2 instance is a virtual machine running on AWS’s hypervisor fleet. You pick:

Instance type (CPU/RAM/network shape)
AMI (the disk image — the OS and preinstalled software)
Subnet (VPC placement → which AZ)
Security group (firewall rules)
Key pair (SSH access)
User data (first-boot script)

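You pay per second (with a 1-minute minimum) while the instance runs. This applies to most operating systems, including Linux and Windows; some other platforms are billed by the full hour.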
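The effect of the 1-minute minimum is easy to check with a little arithmetic. A minimal sketch, assuming a placeholder on-demand rate of 0.096 USD per hour (not a quoted AWS price); compute_charge is a hypothetical helper name:

```shell
#!/bin/sh
# Estimate the compute charge for one run of an instance under per-second
# billing with a 60-second minimum. The hourly rate is a placeholder.
compute_charge() {
    seconds=$1      # time spent in the running state
    hourly_rate=$2  # assumed on-demand price in USD per hour
    awk -v s="$seconds" -v r="$hourly_rate" 'BEGIN {
        billed = (s < 60) ? 60 : s      # apply the 1-minute minimum
        printf "%.4f\n", billed * r / 3600
    }'
}

compute_charge 45 0.096     # a 45-second run is billed as 60 seconds
compute_charge 5400 0.096   # a 90-minute run is billed for exactly 5400 seconds
```

Only compute is covered here; EBS volumes are billed separately for as long as they exist.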

AMI — Amazon Machine Image

An AMI is a bootable disk image. Types:

AWS-provided — Amazon Linux, Ubuntu, RHEL, Windows Server, etc. Maintained by AWS or the OS vendor.
Marketplace — pre-built from third parties (often with licensing bundled)
Your own — you create from a running instance (CreateImage) to capture a golden state

AMIs are region-scoped. Copying to another region is a deliberate action and can take hours for large AMIs, so for cross-region HA you copy in advance.

Two underlying storage types:

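EBS-backed — root volume is an EBS volume created from a snapshot; can stop/start, and data survives both reboot and stop
Instance store-backed (rare today) — root volume is ephemeral local disk; cannot be stopped, only rebooted or terminated, and data is lost on termination or host failure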

Instance types — the menu

Instance types are named <family><generation><attributes>.<size>:

    m 6 i . large
    │ │ │   └──── size (nano, micro, small, medium, large, xlarge, 2xlarge...)
    │ │ └──────── attributes, optional (a=AMD, g=Graviton/ARM, i=Intel, n=network-optimized)
    │ └────────── generation (5, 6, 7)
    └──────────── family
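The naming scheme is regular enough to split mechanically, which helps when filtering inventory or cost reports by family. A minimal sketch, assuming names of the form letters, digits, optional attribute letters, a dot, then the size; parse_instance_type is a hypothetical helper, and unusual names such as the high-memory u-* types are not handled:

```shell
#!/bin/sh
# Split an EC2 instance type name such as "m6i.large" into its parts.
parse_instance_type() {
    name=$1
    size=${name#*.}       # text after the dot
    prefix=${name%%.*}    # text before the dot, e.g. "m6i"
    family=$(printf '%s' "$prefix" | sed 's/[0-9].*$//')
    generation=$(printf '%s' "$prefix" | sed 's/^[a-z]*\([0-9]*\).*$/\1/')
    attributes=$(printf '%s' "$prefix" | sed 's/^[a-z]*[0-9]*//')
    printf 'family=%s generation=%s attributes=%s size=%s\n' \
        "$family" "$generation" "$attributes" "$size"
}

parse_instance_type m6i.large
parse_instance_type t3.medium    # no attribute letters, so that field is empty
```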

Families you’ll see most:

Family	Profile	Typical use
t	Burstable — CPU credits	Dev, small services, low-baseline workloads
m	General-purpose — balanced CPU/RAM	Default production choice
c	Compute-optimised — high CPU/RAM ratio	CPU-bound services, batch
r	RAM-optimised	Databases, caches, in-memory analytics
i / d	Storage-optimised — large local NVMe (i) or dense local HDD (d)	Databases and data stores needing fast or large local disks
g / p / inf / trn	GPU / ML accelerators	Training, inference, CUDA workloads

Sizing hint: start smaller than you think (for example t3.medium or m6i.large) and resize up based on observed load. Oversizing is one of the most common sources of wasted spend.

Instance lifecycle

  pending → running → stopping → stopped → pending → running ...
               ↓
         shutting-down → terminated (destroyed; root EBS volume gone unless configured otherwise)

Running — billed for compute + EBS
Stopped — not billed for compute; still billed for EBS
Terminated — gone; root volume deleted by default (unless “Delete on Termination” was unchecked)
Rebooting — simple restart; stays on the same host

Stop/start vs reboot: stop/start can move the instance to a different physical host, so instance store data is lost and the auto-assigned public IP changes (unless you use an Elastic IP). Reboot stays on the same host and keeps both.
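The lifecycle can be written down as a small transition table. A minimal sketch using the state names EC2 reports (a restarted instance shows as pending again, and termination passes through shutting-down); it covers only the main transitions, and next_states is a hypothetical helper:

```shell
#!/bin/sh
# Print the states an instance can move to from a given state.
next_states() {
    case $1 in
        pending)       echo "running" ;;
        running)       echo "stopping shutting-down" ;;  # stop or terminate; reboot stays in running
        stopping)      echo "stopped" ;;
        stopped)       echo "pending shutting-down" ;;   # start or terminate
        shutting-down) echo "terminated" ;;
        terminated)    echo "" ;;                        # final state, nothing follows
        *)             echo "unknown state: $1" >&2; return 1 ;;
    esac
}

next_states running
next_states stopped
```

A table like this is handy in automation that waits on instances: anything outside the listed transitions is a sign the script is polling the wrong resource.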

Key pairs and initial access

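A key pair is an SSH key pair (on Windows, it is used to decrypt the initial Administrator password for RDP). You create or import it per region; at first boot, EC2 makes the public key available to the instance, which adds it to the default user’s authorized_keys.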

You don’t set a root password on Amazon Linux / Ubuntu AMIs — SSH via key pair only
Losing the private key means no SSH in. You’d need to stop the instance, attach root volume to another instance, and inject a new key — painful. Treat private keys with care.
Better modern option: EC2 Instance Connect (ephemeral SSH keys via IAM) or Systems Manager Session Manager (no SSH, no open port 22, full IAM control)

User data — first-boot script

A script that runs on first boot only (by default). Passed at launch time, retrieved from the instance metadata service (IMDS) by the cloud-init agent (Linux) or EC2Launch (Windows).

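#!/bin/bash
# Assumes a yum-based AMI whose default repos carry nginx
# (on Amazon Linux 2 it comes from amazon-linux-extras instead).
yum update -y
yum install -y nginx
systemctl enable --now nginx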

Used for:

Installing packages
Downloading config
Joining clusters
Initial bootstrapping before a config management tool (Ansible, etc.) takes over

Limit: 16 KB of raw data (before base64 encoding). For anything bigger, have user data download and run the larger payload.
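It is cheaper to catch an oversized script before launch than to debug a failed one. A minimal sketch of a pre-launch check, assuming the limit is measured on the raw script; check_user_data_size is a hypothetical helper:

```shell
#!/bin/sh
# Check that a user data script fits within the 16 KB limit before launch.
USER_DATA_LIMIT=16384   # 16 KB, measured on the raw (unencoded) script

check_user_data_size() {
    size=$(wc -c < "$1" | tr -d ' ')   # byte count of the file
    if [ "$size" -le "$USER_DATA_LIMIT" ]; then
        echo "ok: $size bytes"
    else
        echo "too large: $size bytes (limit $USER_DATA_LIMIT)"
        return 1
    fi
}
```

Run it as `check_user_data_size bootstrap.sh` (the file name is only an example) in a CI step or a launch wrapper; the non-zero exit status makes it easy to fail the pipeline.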

Storage options

Storage	Lifecycle	Performance	Use
EBS (Elastic Block Store)	Persists independent of instance	High IOPS available (io2 class)	Root volumes, databases, anything needing persistence
Instance store	Ephemeral — lost on stop/terminate	Local NVMe, very fast	Scratch, temp, distributed systems with their own replication
EFS (NFS)	Separate service	Network latency	Shared multi-instance filesystem
FSx	Separate service	Depends on flavour	Lustre for HPC, Windows File Server, NetApp ONTAP
S3	Separate service	Object API only	Archives, artefacts, backups

EBS volume types you’ll see:

gp3 — general-purpose SSD; the modern default; baseline 3000 IOPS, tunable
io2 — high-durability SSD with provisioned IOPS; for databases needing >16K IOPS
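st1 / sc1 — HDD volumes for sequential access: st1 is throughput-optimised, sc1 is the cheaper cold tier; for logs, data lakes
gp2 — older default; gp3 is cheaper per GB and its baseline performance does not depend on volume size, so migrate if you haven’t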
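The gp2 versus gp3 difference is clearest on small volumes. A minimal sketch comparing baselines, assuming gp2 scales at 3 IOPS per GiB with a floor of 100 and a cap of 16,000; these gp2 figures come from AWS’s published volume specifications rather than from this note, so verify them before relying on the output:

```shell
#!/bin/sh
# Compare the gp2 baseline (scales with size) with the flat gp3 baseline.
GP3_BASELINE_IOPS=3000

gp2_baseline_iops() {
    size_gib=$1
    iops=$((size_gib * 3))                              # 3 IOPS per GiB (assumed gp2 rule)
    if [ "$iops" -lt 100 ]; then iops=100; fi           # floor
    if [ "$iops" -gt 16000 ]; then iops=16000; fi       # cap
    echo "$iops"
}

for size in 20 100 1000 6000; do
    printf '%5s GiB: gp2 %5s IOPS, gp3 %s IOPS\n' \
        "$size" "$(gp2_baseline_iops "$size")" "$GP3_BASELINE_IOPS"
done
```

Under these assumptions a gp2 volume needs about 1,000 GiB to match the gp3 baseline, which is why small root volumes benefit most from migrating.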

Networking per instance

Every EC2 instance has at least one ENI (Elastic Network Interface):

Private IP from the subnet
Optional public IPv4 (auto-assigned or via Elastic IP)
Optional IPv6
One or more Security Groups
MAC address (rarely matters; no L2 adjacency)

You can attach additional ENIs — common for multi-homed firewalls/NVAs, or to give an instance multiple IPs.

Placement and HA concepts

AZ placement — you pick a subnet → that dictates AZ
Placement groups — hint to AWS for co-location or separation:
- Cluster — pack on same rack (low-latency HPC)
- Spread — force separate physical servers (HA for small groups)
- Partition — multiple logical partitions, each on separate infrastructure (HDFS, Kafka)
Auto Scaling Groups (ASG) — launch/terminate instances dynamically based on policies; replaces unhealthy instances; spreads across AZs automatically

HA pattern: ASG + ALB/NLB + multi-AZ subnets. That’s the canonical “web tier” on AWS.
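To see what multi-AZ spreading buys you, it helps to work out how many instances sit in each zone. A minimal sketch of an even split, assuming the group simply balances desired capacity across the listed zones; the real ASG placement logic is more involved, and the zone names and spread_across_azs are illustrative only:

```shell
#!/bin/sh
# Split a desired instance count evenly across a list of availability zones.
spread_across_azs() {
    desired=$1; shift
    az_count=$#
    base=$((desired / az_count))     # every zone gets at least this many
    extra=$((desired % az_count))    # the first "extra" zones get one more
    i=0
    for az in "$@"; do
        n=$base
        if [ "$i" -lt "$extra" ]; then n=$((n + 1)); fi
        echo "$az=$n"
        i=$((i + 1))
    done
}

spread_across_azs 5 eu-west-1a eu-west-1b eu-west-1c
```

With 5 instances over 3 zones, losing the busiest zone still leaves 3 running while the group launches replacements elsewhere.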

Pricing in short

Mode	Discount	When to use
On-demand	0%	Variable workloads, learning, spiky
Savings Plans (Compute / EC2 Instance)	30-70%	Steady baseline for 1-3 years
Reserved Instances	30-70%	Pre-SP legacy; similar effect
Spot	70-90%	Interruptible workloads, batch jobs, stateless web tiers
Dedicated Hosts	—	License / compliance needs for physical isolation

A mature AWS account blends: Savings Plans for baseline, On-demand for bursts, Spot for stateless bulk.
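A rough blended-cost calculation shows why the mix matters. A minimal sketch, assuming a placeholder on-demand rate and discounts picked from inside the ranges in the table; none of these numbers are quoted AWS prices, and blended_cost is a hypothetical helper:

```shell
#!/bin/sh
# Compare a blended monthly bill with the all-on-demand equivalent.
# Arguments: on-demand rate (USD/hour), Savings Plan hours and discount,
# on-demand hours, Spot hours and discount. Discounts are fractions (0.40 = 40%).
blended_cost() {
    awk -v r="$1" -v sph="$2" -v spd="$3" -v odh="$4" -v sth="$5" -v std="$6" 'BEGIN {
        blended   = r * (sph * (1 - spd) + odh + sth * (1 - std))
        on_demand = r * (sph + odh + sth)
        printf "blended=%.2f on_demand=%.2f\n", blended, on_demand
    }'
}

# 10 baseline instances for 730 h under a Savings Plan, 800 h of on-demand
# bursts, 1800 h of Spot, all at an assumed 0.10 USD/hour on-demand rate.
blended_cost 0.10 7300 0.40 800 1800 0.70
```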

Common pitfalls

Stopped instances still cost money via EBS. Terminate what you don’t need.
Public IPs disappear on stop/start unless you use an Elastic IP.
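Burstable (t2/t3) CPU credits run out under sustained load. In standard mode (the t2 default) the instance is then throttled to its baseline; in unlimited mode (the t3 default) it keeps bursting and you pay for the surplus credits. Neither is obvious from CPU utilisation alone; check the CPUCreditBalance metric in CloudWatch.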
Root volume DeleteOnTermination is on by default — terminate, volume is gone. For important data, snapshot or detach first.
Security group: allow 0.0.0.0/0 on port 22 — common bad habit. Use SSM Session Manager or restrict to your IP.
Choosing the wrong instance family — compute-bound workload on t3 or memory-heavy on c6i is a classic anti-pattern. Monitor, resize.
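The CPU credit pitfall is easier to reason about with numbers. One credit is one vCPU at 100% for one minute. A minimal sketch for an instance in standard mode, assuming t3.micro figures of 2 vCPUs, 12 credits earned per hour and a maximum balance of 288; these values are not from this note, so check them against the AWS burstable instance tables. hours_until_throttled is a hypothetical helper:

```shell
#!/bin/sh
# Estimate how long a burstable instance can sustain a load before its
# credit balance reaches zero (standard mode, where it is then throttled).
hours_until_throttled() {
    balance=$1          # current credit balance
    earn_per_hour=$2    # credits earned per hour
    vcpus=$3
    util_percent=$4     # sustained CPU utilisation per vCPU
    awk -v b="$balance" -v e="$earn_per_hour" -v v="$vcpus" -v u="$util_percent" 'BEGIN {
        spend = v * u * 60 / 100    # credits spent per hour
        net = spend - e             # net drain per hour
        if (net <= 0) { print "never"; exit }
        printf "%.1f\n", b / net
    }'
}

hours_until_throttled 288 12 2 100   # both vCPUs pinned at 100%
hours_until_throttled 288 12 2 10    # load at the baseline: credits never drain
```

Under these assumptions a fully charged instance lasts under three hours at full load, which is why a sustained CPU-bound workload belongs on an m or c family instead.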
