AWS CloudTrail Fundamentals

CloudTrail is AWS’s audit log. API activity in the account (console clicks, CLI commands, SDK calls, and actions AWS services take on your behalf) is recorded with identity, source, timestamp, and parameters. Control-plane calls are captured by default; data-plane calls are captured only where you enable them. It’s the service you reach for during security incidents, compliance audits, and “who deleted our production bucket” forensics. On every AWS account, CloudTrail should be on, centralized in a separate account, and immutable.

What CloudTrail records

Each recorded API call produces a structured JSON event like this (abbreviated and simplified):

{
  "eventTime": "2026-04-24T02:35:00Z",
  "eventName": "TerminateInstances",
  "eventSource": "ec2.amazonaws.com",
  "awsRegion": "us-east-1",
  "userIdentity": {
    "type": "AssumedRole",
    "arn": "arn:aws:sts::123:assumed-role/AdminRole/alice",
    "accessKeyId": "ASIA..."
  },
  "sourceIPAddress": "1.2.3.4",
  "userAgent": "aws-cli/2.13.0",
  "requestParameters": { "instanceId": "i-abc" },
  "responseElements": { ... }
}

This answers the forensic questions: who, what, when, from where, against which resource, and with what result.
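As a rough illustration of how those questions map onto event fields, here is a minimal Python sketch. The function name `summarize_event` and the output keys are made up for this example; the input field names are the ones shown in the sample event, and `errorCode` is assumed to be present only when a call fails.

```python
import json

# Trimmed-down event in the shape shown above; values are illustrative.
RAW_EVENT = """
{
  "eventTime": "2026-04-24T02:35:00Z",
  "eventName": "TerminateInstances",
  "eventSource": "ec2.amazonaws.com",
  "awsRegion": "us-east-1",
  "userIdentity": {
    "type": "AssumedRole",
    "arn": "arn:aws:sts::123:assumed-role/AdminRole/alice"
  },
  "sourceIPAddress": "1.2.3.4",
  "requestParameters": {"instanceId": "i-abc"}
}
"""


def summarize_event(event: dict) -> dict:
    """Pull the who/what/when/where fields out of one CloudTrail event."""
    identity = event.get("userIdentity", {})
    return {
        # Fall back to the identity type when no ARN is recorded.
        "who": identity.get("arn", identity.get("type", "unknown")),
        "what": f'{event.get("eventSource")}:{event.get("eventName")}',
        "when": event.get("eventTime"),
        "where_from": event.get("sourceIPAddress"),
        "target": event.get("requestParameters"),
        # Assumption: errorCode appears only on failed calls.
        "result": event.get("errorCode", "success"),
    }


summary = summarize_event(json.loads(RAW_EVENT))
print(summary)
```

In practice the events arrive in batches inside gzipped log files, so a real consumer would loop over the records in each file rather than parse a single string.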

Event types

CloudTrail captures three categories, each billed and configured separately:

Type | What it captures | Default
Management events | Control-plane ops: RunInstances, PutBucketPolicy, CreateUser, etc. | ✅ On: visible in the 90-day Event History at no charge; the first copy delivered by a trail is also free
Data events | Data-plane ops: S3 GetObject / PutObject, Lambda Invoke, DynamoDB item-level ops | ❌ Off by default: high-volume, billed per event
Insights events | Anomalous spikes in write management events (e.g. a 10x surge in CreateUser) | ❌ Opt-in

Why data events aren’t on by default: volume. An active S3 bucket can generate billions of GetObject events/day. You enable data events selectively on buckets/Lambda/DynamoDB tables that matter (critical data stores, sensitive buckets).
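One way to express "only the buckets that matter" is an advanced event selector. The sketch below builds the selector as a plain Python dict; the helper name `s3_data_event_selector` and the bucket name are hypothetical, and the field names follow the advanced event selector format as I understand it, so check them against the current API reference before relying on them.

```python
def s3_data_event_selector(bucket_names, read_only=None):
    """Build one advanced event selector limited to the given S3 buckets.

    read_only=None logs reads and writes; True or False narrows to one kind.
    """
    fields = [
        {"Field": "eventCategory", "Equals": ["Data"]},
        {"Field": "resources.type", "Equals": ["AWS::S3::Object"]},
        # The trailing slash scopes the match to objects inside each bucket.
        {
            "Field": "resources.ARN",
            "StartsWith": [f"arn:aws:s3:::{name}/" for name in bucket_names],
        },
    ]
    if read_only is not None:
        fields.append(
            {"Field": "readOnly", "Equals": ["true" if read_only else "false"]}
        )
    return {"Name": "Scoped S3 data events", "FieldSelectors": fields}


# Hypothetical bucket; log write operations only.
selector = s3_data_event_selector(["payroll-exports"], read_only=False)
print(selector)

# With boto3 installed and credentials configured, a selector like this could
# be applied with something along the lines of:
#   boto3.client("cloudtrail").put_event_selectors(
#       TrailName="my-trail", AdvancedEventSelectors=[selector])
```

Scoping by ARN prefix keeps the event volume, and therefore the bill, proportional to the buckets you actually care about.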

Trails — the delivery mechanism

A trail is the configuration that persists CloudTrail events to S3 and optionally CloudWatch Logs / EventBridge. Without a trail, you only have the 90-day Event History (management events only, read-only). You can search it and download filtered results, but it gives you no continuous delivery and no long-term retention.

Make sure every account is covered by a trail. The trail writes gzipped JSON log files to an S3 bucket, which should be encrypted and have integrity validation enabled.

Scope options

  • Single-region: events from one region. Rarely the right choice.
  • Multi-region (all regions): the default when you create a trail in the console; with the CLI you have to ask for it (--is-multi-region-trail). Captures global-service events and events from every region.
  • Organization trail: one trail created at the Organization level that automatically applies to every member account. Member accounts can see it but cannot modify or delete it. This is the correct default for a multi-account environment.

Log file integrity

CloudTrail can digitally sign its log files: enable “log file integrity validation” on the trail. You can then prove cryptographically that logs haven’t been modified or deleted since delivery. For compliance regimes (SOC 2, HIPAA, PCI DSS), this is usually expected.

Every production AWS org should have, at minimum:

  1. Organization Trail, multi-region, management + insights events → writes to a dedicated log-archive account S3 bucket
  2. Log file integrity validation on
  3. S3 bucket in the log account with:
    • Bucket policy restricting access (writes from the CloudTrail service principal only; reads limited to the log account)
    • Object Lock (WORM) for tamper-proofing
    • MFA-delete
    • KMS-encrypted
  4. Trail sending copy to CloudWatch Logs → metric filter + alarms on sensitive events (root login, policy changes, IAM user creation, CloudTrail changes themselves)
  5. Data events enabled on critical buckets / Lambda functions / DynamoDB tables — scoped, not global
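
Parts of this checklist can be checked mechanically. The following is a minimal sketch, assuming a trail description dict in the shape returned by the DescribeTrails API. The keys used are response fields as I understand them; the function name `baseline_gaps`, the messages, and the sample values are illustrative. It does not cover the bucket-side controls (Object Lock, MFA-delete, bucket policy).

```python
# Boolean settings that should be switched on for the trail.
REQUIRED_FLAGS = {
    "IsOrganizationTrail": "not an organization trail",
    "IsMultiRegionTrail": "not multi-region",
    "LogFileValidationEnabled": "log file integrity validation is off",
}

# Settings that should carry a non-empty value.
REQUIRED_VALUES = {
    "KmsKeyId": "no KMS key configured for log files",
    "CloudWatchLogsLogGroupArn": "no CloudWatch Logs delivery for alarms",
}


def baseline_gaps(trail: dict) -> list:
    """Return a human-readable list of baseline items the trail is missing."""
    gaps = [msg for key, msg in REQUIRED_FLAGS.items() if not trail.get(key, False)]
    gaps += [msg for key, msg in REQUIRED_VALUES.items() if not trail.get(key)]
    return gaps


# Hypothetical trail description with two gaps.
example_trail = {
    "Name": "org-trail",
    "IsOrganizationTrail": True,
    "IsMultiRegionTrail": True,
    "LogFileValidationEnabled": False,
    "KmsKeyId": "arn:aws:kms:us-east-1:123:key/example",
}
print(baseline_gaps(example_trail))
```

A check like this fits naturally into a scheduled job in the log-archive or security account, so drift from the baseline is noticed rather than discovered during an incident.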

Many orgs layer AWS CloudTrail Lake (a managed queryable data store, SQL-style) or ship to a SIEM (Splunk, Sentinel, Chronicle) for broader correlation.

Global vs regional service events

Most services are regional — events land in the region of the API call. A few are global and CloudTrail records their events in us-east-1 specifically:

  • IAM
  • STS (calls to the global endpoint; calls to a regional STS endpoint are logged in that region)
  • Route 53
  • CloudFront
  • Organizations
  • Support
  • WAF Classic

Implication: your multi-region trail captures these automatically. A single-region trail in eu-west-1 would miss IAM activity entirely — one more reason “multi-region trail” is the default.

Alerting on CloudTrail

Two common patterns:

1. CloudWatch Logs metric filter + alarm

Trail → CloudWatch Logs → Metric Filter (pattern: ConsoleLogin with "Failure")
      → Custom metric → Alarm → SNS → PagerDuty

Classic playbook alarms (from the CIS AWS Foundations Benchmark):

  • Root account usage
  • Unauthorized API calls (errorCode matching AccessDenied* or *UnauthorizedOperation)
  • IAM policy changes
  • CloudTrail configuration changes (tampering!)
  • Network ACL / SG changes
  • S3 bucket policy changes
  • Disabling or scheduled deletion of customer managed KMS keys
  • Route table changes
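
To make the matching logic behind a few of these alarms concrete, here is a Python sketch that classifies a single event. It mirrors the spirit of the benchmark's metric filters rather than their exact syntax; the function name `alarm_reasons` and the reason labels are invented for the example, and only three of the alarms are covered.

```python
# API calls that change or stop a trail.
TRAIL_CHANGE_EVENTS = {
    "CreateTrail", "UpdateTrail", "DeleteTrail", "StartLogging", "StopLogging",
}


def alarm_reasons(event: dict) -> list:
    """Return the alarm categories (if any) that one CloudTrail event triggers."""
    reasons = []
    identity = event.get("userIdentity", {})
    error = event.get("errorCode", "")

    # Root usage: a root identity acting directly, not a service acting for it.
    if (
        identity.get("type") == "Root"
        and "invokedBy" not in identity
        and event.get("eventType") != "AwsServiceEvent"
    ):
        reasons.append("root-account-usage")

    # Unauthorized calls: error codes vary by service, so match both families.
    if error.startswith("AccessDenied") or error.endswith("UnauthorizedOperation"):
        reasons.append("unauthorized-api-call")

    # Tampering with CloudTrail itself.
    if (
        event.get("eventSource") == "cloudtrail.amazonaws.com"
        and event.get("eventName") in TRAIL_CHANGE_EVENTS
    ):
        reasons.append("cloudtrail-config-change")

    return reasons


print(alarm_reasons({
    "eventSource": "cloudtrail.amazonaws.com",
    "eventName": "StopLogging",
    "userIdentity": {"type": "AssumedRole"},
}))
```

In a metric-filter setup each of these checks would be its own filter and alarm, so that the page tells the on-call engineer which category fired.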

2. EventBridge rules

AWS API call via CloudTrail → EventBridge pattern → Lambda/SNS/Step Functions

More flexible than metric filters; can match on detailed event structure. Often the modern choice for real-time automated response.
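
As an illustration, the pattern below would match attempts to stop or delete a trail. The pattern itself uses the standard envelope for CloudTrail-sourced events; the `matches` function is a deliberately simplified stand-in for EventBridge's matching (exact values only, no prefix or anything-but operators), written only so the pattern can be exercised locally.

```python
import json

# Event pattern: every listed key must match; a list means "any of these values".
TRAIL_TAMPER_PATTERN = {
    "source": ["aws.cloudtrail"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["cloudtrail.amazonaws.com"],
        "eventName": ["StopLogging", "DeleteTrail"],
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: nested dicts recurse, lists are exact-value choices."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Illustrative event envelope as EventBridge would deliver it.
sample_event = {
    "source": "aws.cloudtrail",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "cloudtrail.amazonaws.com",
        "eventName": "StopLogging",
        "userIdentity": {"type": "AssumedRole"},
    },
}

print(json.dumps(TRAIL_TAMPER_PATTERN, indent=2))
print(matches(TRAIL_TAMPER_PATTERN, sample_event))
```

The JSON printed by the first call is what you would paste into a rule definition; the target (Lambda, SNS, Step Functions) is configured separately on the rule.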

CloudTrail Lake

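A managed event data store that keeps CloudTrail events (up to 10 years) with SQL querying. The FROM clause takes the event data store's ID rather than a display name, and a query on DeleteObject only returns results if the store ingests S3 data events:

SELECT eventTime, userIdentity.arn, eventName
FROM <event-data-store-id>
WHERE eventName = 'DeleteObject'
  AND eventTime > '2026-04-01 00:00:00'
ORDER BY eventTime DESC
LIMIT 100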

Good when you want historical queries without shipping logs to a SIEM. Pricing is based on the volume of data ingested and retained, plus the data scanned by each query.

Validating tamper-free logs

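When integrity validation is on, CloudTrail delivers a digest file every hour that lists the log files delivered in that period along with a hash of each one. Each digest is signed with a key held by AWS and references the previous digest, forming a chain. CLI: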

aws cloudtrail validate-logs --trail-arn <arn> \
  --start-time 2026-04-24T00:00:00Z

This validates the hash chain and signatures, and reports any log or digest file that has been altered or deleted.
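
The core of that check is a hash comparison. The Python sketch below shows only that one step, under stated assumptions: that a digest entry carries `hashValue` (hex-encoded) and `hashAlgorithm` fields, and that the hash is computed over the uncompressed log file content. It does not verify the digest's signature or the chain between digests, which the CLI command also does; the function name and sample data are made up.

```python
import hashlib


def log_file_matches_digest(log_bytes: bytes, digest_entry: dict) -> bool:
    """Compare a log file's SHA-256 hash with the value recorded in a digest entry.

    log_bytes is assumed to be the uncompressed log file content.
    """
    if digest_entry.get("hashAlgorithm") != "SHA-256":
        raise ValueError("this sketch only handles SHA-256 digests")
    return hashlib.sha256(log_bytes).hexdigest() == digest_entry["hashValue"]


# Fabricated log content and a matching digest entry, for demonstration only.
original = b'{"Records": [{"eventName": "TerminateInstances"}]}'
entry = {
    "hashAlgorithm": "SHA-256",
    "hashValue": hashlib.sha256(original).hexdigest(),
}

print(log_file_matches_digest(original, entry))                # untouched file
print(log_file_matches_digest(original + b" ", entry))         # altered file
```

Because each digest also records the hash of the previous digest, deleting or replacing a digest file breaks the chain even if the individual log hashes still line up.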

What CloudTrail doesn’t see

  • Data plane traffic beyond what “data events” captures — e.g. actual SQL queries to RDS are not CloudTrail-visible
  • Network traffic inside VPCs — that’s VPC Flow Logs territory
  • Guest-OS actions on EC2 — use SSM, OSQuery, auditd, CloudWatch Agent
  • AWS support interactions with your account — separately logged

For full-picture auditing you need CloudTrail + VPC Flow Logs + Config + GuardDuty + OS-level audit, depending on compliance scope.

CloudTrail vs CloudWatch — the frequent confusion

Aspect | CloudTrail | CloudWatch
Primary data | Discrete API-call events | Numeric time-series metrics + log lines
Use for | Audit, forensics, compliance | Monitoring, alerting, dashboards
Cardinality | Millions of distinct events | Aggregated metrics
Retention | Long-term (S3 / Lake), years | Metrics: up to 15 months at decreasing resolution; logs: per log-group retention setting

They complement each other. CloudTrail feeds CloudWatch Logs when you want to alarm on audit events; CloudWatch alone can’t tell you “who made this change.”

Common pitfalls

  1. No Organization trail. Member accounts can disable their own trails, and an attacker’s first move is often to do exactly that. Organization trails cannot be modified or stopped from a member account.
  2. Trail logging into the same account. A compromised account can tamper with its own trail. Use a dedicated log-archive account with Object Lock.
  3. Forgetting data events on sensitive S3 buckets. Someone exfiltrates petabytes; CloudTrail says “bucket configuration didn’t change.” Data events would have recorded the GetObject calls.
  4. No alarms on CloudTrail itself. If the trail is disabled, you should get paged. Alarm on StopLogging and DeleteTrail events.
  5. KMS-encrypted S3 bucket with the wrong key policy. Log delivery fails without any obvious signal. Monitor delivery errors, for example the LatestDeliveryError field returned by get-trail-status.
  6. Assuming 90-day Event History = enough. It’s read-only, short, management-only. Real audit lives in a trail-to-S3.
  7. IAM events missed because of single-region trail. Multi-region or bust.

Mental model

  • CloudTrail = append-only ledger of every control-plane (and optionally data-plane) API call.
  • Trails = the storage and delivery config that makes the ledger durable and queryable.
  • Organization + log-archive account + Object Lock + KMS = the tamper-proofing recipe.
  • Metric filters / EventBridge = the real-time alarming layer on top.
  • CloudTrail Lake / SIEM = the long-term analytics layer.
  • First thing an attacker tries to kill. Design accordingly.
