~/articles/sql-vs-nosql

◆Beginner

SQL vs NoSQL — How to Actually Choose

A practical decision guide. When relational wins, when document wins, when neither is right, and the lazy "use Postgres" advice that's mostly correct.

12 min read2026-02-22Ironclad Academy

#databases #sql #nosql #system-design

// DEPTH

the full breakdown — requirements, capacity, evolution, trade-offs

The "SQL vs NoSQL" debate is mostly a relic of 2010. The honest, modern answer is: start with Postgres or MySQL; reach for NoSQL when you have a specific reason. This article walks through what those reasons actually are.

The default: Postgres

For 95% of new projects, Postgres is the right starting point. You get ACID transactions out of the box — no extra effort, just foreign keys, unique constraints, and rollbacks that work. A beefy single instance scales to roughly 1 TB of hot data and around 50k QPS for simple reads with the working set in RAM; complex queries or heavy write workloads will saturate it sooner, but that ceiling is further away than most teams ever reach.

When you do need to store semi-structured data, the jsonb column type gives you NoSQL-style flexibility without giving up SQL. Full-text search, geospatial queries via PostGIS, and vector similarity search via pgvector all run on the same engine — no extra infrastructure, no extra operational surface. The ecosystem of backups, observability tools, cloud-managed offerings, and decades of battle-tested replication is a quiet but real advantage.

If you don't have a specific reason to choose otherwise, choose Postgres. "I might need NoSQL someday" is not a specific reason.

When SQL stops being enough

flowchart TD
    S[Single Postgres] -->|"hot data > RAM"| RR[Read replicas]
    RR -->|"writes still bottleneck"| BIG[Bigger box]
    BIG -->|"hit machine ceiling<br/>(working set > RAM,<br/>writes saturate disk)"| SHARD[Shard]
    SHARD -->|"need cross-shard txn"| NEW[NewSQL: CockroachDB / Spanner]
    BIG -->|"workload is fundamentally<br/>NoSQL-shaped"| NS[NoSQL]
    style NEW fill:#a855f7,color:#fff
    style NS fill:#15803d,color:#fff

Postgres runs out of headroom in a few specific situations. The working set won't fit in RAM above a few TB on a single instance. Sustained write throughput for real OLTP — multi-index rows, durable commits, no batching tricks — hits a ceiling in the low tens-of-thousands of TPS on top-tier hardware; bulk loads via COPY can push higher, but random OLTP writes past ~20–50k TPS is where sharding or an LSM-based store starts looking attractive. Multi-region active-active writes with strong consistency are genuinely hard to retrofit onto a single-primary setup. And when your schema has grown large enough that an ALTER TABLE blocks production for hours, you start understanding why some teams ran away.

Each of those is a specific, observable failure mode. Notice that "it might not scale" isn't on the list.

NoSQL flavors and when each fits

Key-value stores

The idea is as simple as it sounds: one key maps to one value, no schema, no joins. Redis and DynamoDB are the most common choices; Memcached is still in production at many companies for pure caching.

Reach for a key-value store when your access pattern is almost entirely point lookups at high QPS — think session tokens, cached user profiles, feature flags, rate limit counters. Redis running in-memory can handle hundreds of thousands of operations per second on a single node; DynamoDB will scale horizontally without a ceiling you'll ever realistically hit. What you give up is anything resembling ad-hoc queries, complex relationships, or transactions that touch more than one key.

Document stores

Document stores like MongoDB, Couchbase, and Firestore store JSON documents keyed by an ID. The schema lives in the application, not the database — you can add fields without migrations, and nested structures are first-class.

The sweet spot is when each entity in your system is self-contained: a product page, a blog post, a user profile with embedded preferences. Documents map naturally to the objects your application already has in memory, which makes document stores feel ergonomic when the shape is right. The friction appears when entities start referencing each other heavily — you end up doing joins in application code, often across multiple round-trips, which quickly becomes slower and more fragile than a SQL join would have been.

One common misconception worth clearing up: the pre-v4.0 critique of MongoDB was not that it "requires eventual consistency" — single-document operations were always atomic, and write-concern settings gave strong consistency from early on. The real gap was multi-document atomicity: there was no way to update two documents in a single atomic transaction. MongoDB closed that gap in v4.0 (single-replica) and v4.2 (sharded clusters). The real reasons to prefer Postgres instead are a richer query planner, more mature tooling, and the fact that jsonb covers most "store flexible JSON" needs without introducing a separate store.

Column-family stores

Cassandra and ScyllaDB look superficially like tables but behave very differently. Data is organized into wide rows, partitioned by a key you define at schema creation time and sorted within each partition by clustering columns. The storage engine is an LSM-tree — Log-Structured Merge-tree — rather than the B-tree that Postgres and MySQL use. Understanding that difference is the key to understanding when Cassandra wins and when it loses.

flowchart LR
    subgraph LSM ["LSM-tree (Cassandra)"]
        LW[Write] --> CL[Commit log]
        CL --> MT[Memtable in RAM]
        MT --> SS["SSTable on disk\n(immutable)"]
        SS --> COMP[Compaction merges SSTables]
    end
    subgraph BTREE ["B-tree (Postgres)"]
        BW[Write] --> WAL[WAL log]
        WAL --> BP[Buffer pool]
        BP --> PAGE["B-tree page\n(in-place update)"]
    end
    style LSM fill:#a855f7,color:#fff
    style BTREE fill:#0e7490,color:#fff
    style MT fill:#15803d,color:#fff
    style CL fill:#ff6b1a,color:#0a0a0f
    style WAL fill:#ff6b1a,color:#0a0a0f

With LSM, every write is a sequential append — first to an on-disk commit log for durability, then to an in-memory memtable. When the memtable fills, it flushes to an immutable SSTable file. Background compaction merges SSTables over time. Because writes are always sequential appends, there's no such thing as "updating a row in place" — that makes writes extremely fast and avoids the random I/O that hurts B-tree stores under heavy write load.

The flip side: reads may have to consult multiple SSTables before finding the latest version of a row, which is slower than a single B-tree traversal. Bloom filters help (they tell a read "this SSTable definitely doesn't have your key"), and row caches help more. But under high update churn, compaction lag can appear as read latency spikes, and that's a real operational headache.

The practical conclusion: Cassandra is a great fit for time-series data, telemetry, logs, and event sourcing — anything append-mostly with massive sustained write rates. Its leaderless multi-master replication also makes it a natural choice for multi-datacenter active-active deployments where eventual consistency is acceptable. It's a poor fit for ad-hoc queries that don't start from the partition key, for anything requiring cross-partition transactions, and for analytics workloads (use ClickHouse or BigQuery for those). Cassandra forces you to design your schema around your query patterns — you decide the access patterns first, then design the table, not the other way around.

Graph databases

Graph stores like Neo4j, Neptune, and ArangoDB optimize for traversal queries — following edges from node to node across arbitrarily many hops. Fraud detection, recommendation engines, social network "who do you know in common," and knowledge graphs are the canonical use cases.

For these workloads, a graph database is the right tool. But the signal that you need one is very specific: your queries are inherently multi-hop traversals, and they'd be written as a chain of recursive JOINs or a walking BFS in SQL. If your data happens to have some relationships but your queries never walk more than two hops, Postgres handles it fine.

Search engines

Elasticsearch, OpenSearch, and Algolia build inverted indexes for full-text search — breaking text into tokens and mapping each token to the documents that contain it. They're excellent at ranked full-text search, faceted filtering ("show me results in category X sorted by relevance"), and log aggregation.

The one thing to be clear about: search engines are not a primary database. Elasticsearch can lose acknowledged writes on node failure when translog durability is set to async (the default in older versions), and its lack of ACID transactions makes it fundamentally unsuitable as a source of truth. The right pattern is to write to Postgres (or wherever your source of truth lives) and sync to Elasticsearch for search. They complement each other; they don't replace each other.

Time-series databases

InfluxDB, TimescaleDB (a Postgres extension), and Prometheus are built around the (timestamp, metric, value) triple. They compress time-series data efficiently and support downsampling and range aggregations as first-class operations. For metrics, sensor readings, observability, and financial tickers, they're the right tool. For general OLTP, they aren't.

Decision tree

flowchart TD
    Q1{Need ACID transactions?}
    Q1 -->|Yes| Q2{Single-region OK?}
    Q1 -->|No| Q3{Mostly key lookups?}

    Q2 -->|Yes| PG[Postgres / MySQL]
    Q2 -->|No, multi-region writes| NEW[CockroachDB / Spanner]

    Q3 -->|Yes| Q4{Hot? Need persistence?}
    Q3 -->|No| Q5{Time-series?}

    Q4 -->|Hot, no persist| MEM[Memcached]
    Q4 -->|Hot, persist| RD[Redis]
    Q4 -->|Durable KV| DD[DynamoDB]

    Q5 -->|Yes| TS[TimescaleDB / InfluxDB]
    Q5 -->|No| Q6{Massive writes,<br/>append-mostly?}

    Q6 -->|Yes| CA[Cassandra]
    Q6 -->|No| Q7{Document shape?}

    Q7 -->|Yes| MO[MongoDB]
    Q7 -->|No| Q8{Graph queries?}

    Q8 -->|Yes| NEO[Neo4j]
    Q8 -->|No| BACK[Back to Postgres + jsonb]

    style PG fill:#ff6b1a,color:#0a0a0f
    style RD fill:#15803d,color:#fff
    style CA fill:#a855f7,color:#fff

Polyglot persistence

Real systems use multiple databases, each for what it's best at:

App	Likely stack
Twitter (pre-2022)	Manhattan (tweets, users, DMs — KV) + Redis (timelines) + Elasticsearch (search) + Kafka (events)
Uber	Schemaless on MySQL (trips) + Redis (current driver locations, real-time cache) + Cassandra (location history, durable) + Pinot (analytics)
Stripe	Postgres (everything that's money) + Redis (rate limits) + Snowflake (analytics)

Each store solves one problem well. The trap is using three stores for the same problem, or using one store for everything and fighting its limitations. Twitter's timeline system lived in Redis because the access pattern was purely "give me the last 100 tweet IDs for this user" — a point lookup at enormous QPS. Stripe puts money in Postgres because money needs ACID. Neither team is dogmatic about it; they just matched the store to the shape.

The real cost of premature NoSQL

"We should use NoSQL someday, so let's start with it now" sounds like prudent future-proofing. In practice it backfires almost every time, for a simple reason: you don't yet know your access patterns, so you can't pick the right NoSQL store, let alone design the right schema for it.

What you do know is that NoSQL has real costs from day one. Without joins, you denormalize — which means writing the same data to multiple places on every update, which means consistency landmines when one of those writes fails. Without strong consistency, you get subtle bugs that only appear under concurrent load. Without the Postgres ecosystem, your backup, observability, and tooling story is thinner. Cassandra operations — compaction tuning, repair jobs, tombstone handling — have bitten many teams that chose it for a write-heavy workload and then discovered their reads were slow.

Wait. Migrate when the specific failure mode forces you to. By then you'll know the access patterns, and you can pick the right NoSQL and design the right schema.

Migration is hard but doable

When you do need to migrate — say, from Postgres to Cassandra for an IoT pipeline that's grown past what Postgres can handle — the pattern is well-trodden:

flowchart LR
    A[Postgres only] --> B[Dual-write to both]
    B --> C[Backfill history]
    C --> D[Read from both, compare]
    D --> E[Read from new only]
    E --> F[Stop writing to old]
    style B fill:#ff6b1a,color:#0a0a0f
    style E fill:#15803d,color:#fff

Dual-write both stores, backfill historical data, shadow-read from the new store and compare results until you trust them, then cut over reads and finally shut off the old writes. It takes months, not days. But it's a known path, and the fact that it's painful is actually useful — it means teams only migrate when there's a clear reason, not on a hunch.

Three mistakes worth naming

Using NoSQL because it's "webscale." NoSQL is not faster than Postgres at small-to-medium scale. It's shaped differently. You're paying complexity for a shape advantage that doesn't matter at your current scale. A well-indexed Postgres can handle a lot more traffic than people expect.

Writing SQL like it's NoSQL. Fetching one row at a time with SELECT *, doing joins in application code, ignoring the query planner. You're paying for a relational engine and using none of its power.

Picking MongoDB for relational data and fighting it with transactions. MongoDB supports ACID transactions since v4.0, but leaning on them heavily means you've chosen the wrong tool — you'd be better served by Postgres, which was designed for this. The symptoms are race conditions, unexpected inconsistencies, and hot documents with high write contention.

Justifying your choice in an interview

When the interviewer asks "why this database," the answer should cover six dimensions. What shape is the data — highly relational with lots of joins, a collection of self-contained documents, time-series triples, or pure key-value lookups? What does the hot path query look like — can you predefine it (Cassandra requires this), or do you need ad-hoc flexibility (Postgres)? What is the read/write ratio and throughput — read-heavy workloads want caching and replicas, append-mostly write-heavy workloads want LSM-tree stores, random-update write-heavy workloads still want Postgres or NewSQL. What are the consistency requirements — money and inventory need ACID, social feeds and caches are fine with eventual. What does the scale look like today and in two years — Postgres handles a lot before you need to migrate, and CockroachDB or Spanner extend that further with ACID guarantees at higher operational cost. Finally, what is the operational maturity of your team — Cassandra's failure modes require specialized on-call knowledge, and an honest answer includes whether your team can handle them.

A staff-level answer ties each choice back to something specific and observable: "We chose Cassandra here because our IoT pipeline does 2M writes/sec across 100M device partitions with no cross-partition reads, and we need multi-region active-active. If we needed joins or analytics, we wouldn't use it."

Things you should now be able to answer

Why is Postgres the right default for most projects?
When is jsonb in Postgres better than MongoDB? Does MongoDB support ACID transactions, and since when?
What storage engine does Cassandra use, and what does that mean for write vs. read trade-offs?
What workload is Cassandra a great fit for, and what is it terrible at?
Why is "we'll use NoSQL someday" a poor reason to use it now?
How do real big systems combine multiple databases? Name an example with three different stores.
What are the six axes to use when justifying a database choice in an interview?
Why does Cassandra force you to design schemas query-first rather than data-model-first?

Frequently asked questions

▸When should I use Postgres over a NoSQL database?

Postgres is the right default for roughly 95% of new projects. It handles ACID transactions, jsonb for semi-structured data, full-text search, geospatial via PostGIS, and vector search via pgvector on one engine, and a single beefy instance scales to around 1 TB of hot data and 50k QPS for simple reads. Reach for NoSQL only when you can name a specific failure mode: sustained writes past ~20-50k TPS, multi-region active-active, or a workload that is purely key-value at very high QPS.

▸Does MongoDB support ACID transactions?

Yes. MongoDB has supported multi-document ACID transactions since v4.0, extended to sharded clusters in v4.2. The real reasons to prefer Postgres are a richer query planner, more mature tooling, and the fact that jsonb covers most flexible-JSON use cases without a separate store.

▸What storage engine does Cassandra use, and what are the write versus read trade-offs?

Cassandra uses an LSM-tree (Log-Structured Merge-tree) instead of the B-tree that Postgres and MySQL use. Every write is a sequential append to a commit log and then a memtable, which makes writes extremely fast and avoids random I/O. The cost is on reads: they may have to consult multiple SSTables before finding the latest version of a row, and under high update churn, compaction lag can appear as read latency spikes.

▸What is the right workload for Cassandra, and what is it bad at?

Cassandra fits time-series data, telemetry, logs, and event sourcing — anything append-mostly with massive sustained write rates, and multi-datacenter active-active deployments where eventual consistency is acceptable. It is a poor fit for ad-hoc queries that do not start from the partition key, for cross-partition transactions, and for analytics workloads. A staff-level example from the article: an IoT pipeline doing 2M writes per second across 100M device partitions with no cross-partition reads and a need for multi-region active-active.

▸Why is Elasticsearch unsuitable as a primary database?

Elasticsearch can lose acknowledged writes on node failure when translog durability is set to async (the default in older versions), and it lacks ACID transactions, making it fundamentally unsuitable as a source of truth. The correct pattern is to write to Postgres as the source of truth and sync to Elasticsearch for search; the two stores complement rather than replace each other.

← previous

How to Approach a System Design Interview

Microservices vs Monolith — Honestly

// RELATED