Relational vs NoSQL Databases
The fundamental split in the data world. Both are valid, neither is universally better, and the right choice depends on the shape of your data and the questions you ask of it.
Relational (SQL) databases
A relational database stores data in tables with rows and columns. Tables relate to each other via foreign keys. You interrogate the data with SQL — a declarative language where you say what you want, not how to get it.
Examples: PostgreSQL, MySQL/MariaDB, Microsoft SQL Server, Oracle, SQLite.
What makes them “relational”
- Schema-first — you define columns and their types up front. A row must match the schema.
- Relations — the “parent → child” links are first-class, enforced with foreign keys.
- Normalization — breaking data into many small tables to avoid duplication. One customer record, one address record, linked by ID.
- ACID Transactions — all-or-nothing, correctness-first.
- SQL — the common query language that has outlived nearly every competing vision of the future since the 1970s.
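The properties above can be seen in miniature with Python's built-in sqlite3 (the tables and data are invented for illustration; note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`):

```python
import sqlite3

# In-memory database; SQLite enforces foreign keys only when asked.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Schema-first: columns and types are declared up front.
conn.execute("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL NOT NULL
    )""")

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 42.0)")

# The relation is first-class and enforced: an order pointing at a
# nonexistent customer is rejected by the database itself.
try:
    conn.execute("INSERT INTO orders (customer_id, total) VALUES (99, 10.0)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

The same normalization idea scales up: one customer row, many order rows, all linked by ID rather than duplicated.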
Strengths
- Consistency — reads reflect the most recently committed writes (subject to your chosen isolation level).
- Joins — combining data from multiple tables in one query. The killer feature of SQL.
- Mature tooling — decades of optimisation, backup tools, replication patterns, tuning knowledge.
Trade-offs
- Scaling writes is hard — scaling up (bigger server) works; scaling out (many servers sharing the write load) is hard because ACID guarantees fight horizontal distribution.
- Schema changes are friction — altering a huge table can lock or rewrite it; adding a column to a billion-row table can take hours on some engines.
- Not great for non-tabular data — JSON, time-series, graph-like data fit awkwardly (though modern Postgres handles them surprisingly well).
NoSQL databases
“NoSQL” = “Not Only SQL.” It’s a catch-all for everything that doesn’t fit the relational model. There are really four sub-families:
1. Document databases
Store data as self-contained, JSON-like documents. Each document can have different fields. Joins are limited or absent, and there is no fixed schema.
- Examples: MongoDB, Couchbase, AWS DocumentDB
- Good for: content systems, product catalogs, anything where each record has variable fields
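A rough sketch of the document model, using plain Python dicts in place of a real document database (`catalog` and `find` are hypothetical names):

```python
# A document "collection" as a list of dicts: each record is self-contained
# and fields may vary per document.
catalog = [
    {"_id": 1, "name": "T-shirt", "sizes": ["S", "M", "L"]},
    {"_id": 2, "name": "Mug", "capacity_ml": 350},               # different fields
    {"_id": 3, "name": "Poster", "dimensions": {"w": 50, "h": 70}},
]

# Queries filter on whatever fields happen to exist.
def find(collection, **criteria):
    return [doc for doc in collection
            if all(doc.get(k) == v for k, v in criteria.items())]

mugs = find(catalog, name="Mug")
print(mugs[0]["capacity_ml"])  # → 350
```

No schema migration was needed to give the mug a `capacity_ml` field the other products lack.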
2. Key-value stores
The simplest model — a giant dictionary mapping key → value. Extremely fast. Some stores also offer rich operations on values (lists, sets, counters).
- Examples: Redis, Memcached, AWS DynamoDB (partly)
- Good for: caches, session storage, simple lookups, real-time counters
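A toy sketch of a key-value store with per-key expiry, loosely in the spirit of Redis's EXPIRE (`TTLCache` is an invented class, not a real client library; the clock is injectable so expiry can be shown without sleeping):

```python
import time

class TTLCache:
    """A toy key-value store with optional per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at or None)
        self._clock = clock

    def set(self, key, value, ttl=None):
        expires = self._clock() + ttl if ttl is not None else None
        self._data[key] = (value, expires)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires = entry
        if expires is not None and self._clock() >= expires:
            del self._data[key]   # lazy expiry on read
            return default
        return value

now = [0.0]
cache = TTLCache(clock=lambda: now[0])
cache.set("session:42", {"user": "ada"}, ttl=30)
print(cache.get("session:42"))   # → {'user': 'ada'}
now[0] = 31.0                    # advance the fake clock past the TTL
print(cache.get("session:42"))   # → None
```

Session storage is exactly this pattern: get-by-key, with the store reaping expired entries for you.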
3. Wide-column stores
Like a relational table but with trillions of rows and flexible columns per row. Designed for massive scale.
- Examples: Apache Cassandra, ScyllaDB, Google Bigtable, HBase
- Good for: time-series data, log ingestion, write-heavy workloads at huge scale
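A minimal sketch of the wide-column idea — a partition key that owns a group of rows, ordered by a clustering key, with cheap append-only writes (all names here are invented for illustration):

```python
from collections import defaultdict

# Partition key (e.g. a sensor id) -> list of (clustering key, columns).
# Writes are appends; reads scan a single partition by key range.
table = defaultdict(list)

def write(partition, ts, **columns):
    table[partition].append((ts, columns))   # append-only, write-optimised

def read_range(partition, start, end):
    return [(ts, cols) for ts, cols in table[partition] if start <= ts <= end]

write("sensor-1", 100, temp=21.5)
write("sensor-1", 160, temp=21.9)
write("sensor-2", 100, temp=19.0)

print(read_range("sensor-1", 90, 150))   # → [(100, {'temp': 21.5})]
```

The real systems distribute partitions across many nodes by hashing the partition key, which is what makes the write path scale horizontally.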
4. Graph databases
First-class nodes and edges. Queries traverse relationships efficiently.
- Examples: Neo4j, ArangoDB, Amazon Neptune
- Good for: social graphs, fraud detection, recommendation engines, anything where “how are these things connected?” is the main question
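The "how are these things connected?" question is a traversal. A minimal breadth-first sketch over an invented follower graph (a graph database makes this a first-class query instead of repeated self-joins):

```python
from collections import deque

# A tiny social graph as adjacency sets.
follows = {
    "ann": {"bob", "cat"},
    "bob": {"dan"},
    "cat": {"dan"},
    "dan": set(),
}

def shortest_path(graph, start, goal):
    """Breadth-first search for the shortest chain of connections."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None   # no connection

print(shortest_path(follows, "ann", "dan"))
```

Fraud detection and recommendations are variations on the same traversal, just with weights and filters on the edges.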
The CAP theorem (the big trade-off)
In a distributed system you can guarantee at most two of these three:
- Consistency — every read sees the most recent write
- Availability — every request gets a response
- Partition tolerance — the system keeps working when parts can’t talk to each other
Network partitions are unavoidable in practice, so the real choice is CP vs AP:
- CP systems (most SQL, MongoDB with strong consistency) — prefer to refuse requests rather than serve stale data during a partition
- AP systems (Cassandra, DynamoDB) — keep serving, accept that some reads may return older values temporarily (“eventual consistency”)
Traditional SQL chooses CP by default. Many NoSQL systems choose AP by default.
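A toy simulation of the AP choice: a write is accepted on one replica, a read from the other replica is briefly stale, and a background sync converges them. All names and mechanics here are illustrative, not a real replication protocol:

```python
# Two replicas of the same key-value data.
replica_a = {}
replica_b = {}

def write(key, value):
    replica_a[key] = value          # accepted by whichever node is reachable

def read_from(replica, key):
    return replica.get(key)         # may return a stale (or missing) value

def anti_entropy():
    """Background sync: both replicas converge on the union of writes."""
    replica_b.update(replica_a)
    replica_a.update(replica_b)

write("cart:7", ["book"])
stale = read_from(replica_b, "cart:7")    # replica B hasn't seen the write yet
anti_entropy()
fresh = read_from(replica_b, "cart:7")    # replicas have converged
print(stale, fresh)   # → None ['book']
```

A CP system would instead refuse the read (or the write) until the replicas could agree.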
When to pick which
Pick relational if
- Your data has a clear, stable schema with relations
- You need transactions across multiple rows/tables (ACID Transactions)
- Your queries involve joining multiple entities (“all orders from customers in Georgia with total > $500”)
- You’re not at hyperscale (which is most of us)
Pick NoSQL if
- Your data is document-shaped, variable, or schema-less
- You need to scale writes horizontally across many nodes
- Your access pattern is simple and predictable (get-by-key, time-series append)
- Latency matters more than strong consistency
- You have a specialised workload: graph, time-series, search, cache
Real-world answer: both
Most production systems use multiple databases:
- Relational for core transactional data (orders, users, accounts)
- Redis/Memcached for cache
- Elasticsearch for search
- Time-series DB (InfluxDB, TimescaleDB) for metrics
- Object storage for blobs
This is called polyglot persistence.
Postgres is eating the world
Worth a special note: PostgreSQL has absorbed most of what the early NoSQL era was about:
- JSON/JSONB columns for document-like data
- Full-text search
- Time-series via TimescaleDB extension
- Geospatial via PostGIS
- Array types, range types, materialised views
For many workloads that ten years ago required specialised NoSQL, Postgres is now “good enough and you already know SQL.” It’s a defensible default choice.
See also
- ACID Transactions
- Database Indexing
- 🖥️ Server Infrastructure MOC — storage and backup considerations