MODULE 07 / 12crash course

~/roadmap/07-caching

◆Beginner

Caching

Q: What is cache-aside (lazy loading) and why is it the most common cache write strategy?

In cache-aside, the application checks the cache first and populates it only on a miss, then writes to the database directly and deletes or overwrites the cache entry on updates. It is the most common strategy because it is simple and robust — the cache only holds data that has actually been requested — and it is what 80% of systems use, typically paired with a TTL.

Q: What is a thundering herd (cache stampede) and how do you prevent it?

A thundering herd happens when a hot key expires and thousands of concurrent requests simultaneously miss the cache, all hitting the database at once and potentially taking it down. The cleanest fix is single-flight or request coalescing, where only one request rebuilds the cache and every other concurrent request waits on that result. Alternatives include a distributed lock using Redis SET NX EX, probabilistic early expiration, or a soft TTL that triggers a background refresh while still serving stale data.

Q: What is the difference between write-through and write-back caching?

Write-through writes to both the cache and the database synchronously before acknowledging the caller, so the cache is always fresh but every write pays the cost of two stores. Write-back writes to the cache and returns immediately, then flushes to the database asynchronously in batches, dramatically lowering write latency at the cost of potential data loss if the cache node dies before the flush completes.

Q: Why is allkeys-lru the recommended Redis eviction policy, and what does Redis ship with by default?

Redis ships with noeviction as the default, which simply returns errors when the cache is full — almost never what you want in production. Allkeys-lru evicts the least-recently-used key across the entire keyspace, which aligns with the reality that a small fraction of keys absorb most traffic. Volatile-lru is the alternative when you only want to evict keys that have a TTL set.

Q: How do you size a distributed cache, and what hit ratio signals a well-tuned cache?

Start with daily read volume in bytes, multiply by the hot data ratio (typically 20% for Zipf-like distributions), add 20-30% for Redis metadata and memory fragmentation overhead, then divide by per-instance RAM to get cluster size. For example, 10 TB per day times 20% times 1.3 overhead is roughly 2.6 TB, or about 80 instances of 32 GB. A well-sized cache sustains a hit ratio of 90% or higher; below 80% signals the cache is undersized, TTLs are too short, or evictions are too aggressive.

Cache hierarchies, write strategies, eviction policies, Redis data structures, the four classic cache pathologies, and how to size and warm a cache properly.

18 min read2026-01-21Ironclad Academy

#caching #performance #redis #fundamentals

A cache trades memory for speed. It is the single most powerful primitive in system design — the reason 99% of websites feel fast despite their backends being slow. A well-placed cache makes a system 100× faster; a badly-placed cache makes a system 100× more confusing.

This module covers where caches live, how to read/write through them, what to evict, the failure modes nobody warns you about, and the data structures (especially in Redis) that make modern caches more than just "hash maps."

Why caching works

Two facts about real systems:

Access patterns are skewed. A small number of items get most of the requests.
Compute costs more than memory. A SQL query computing aggregates costs 10–1000× more than a Redis GET.

Multiply: keep the 20% hot data in fast storage, and you serve 80% of traffic at 100× the speed for far less than 100× the cost. This is the deal caches offer, and it's why every system has at least one.

The cache hierarchy (you're already using one)

flowchart LR
    U[User Browser] -->|10ms| CDN
    CDN -->|5–30ms| LB[Load Balancer]
    LB -->|2ms| APP[App Server]
    APP -->|in-process<br/>50μs| LC[Local Cache]
    LC -->|miss → 1ms| RC[Redis Cluster]
    RC -->|miss → 10ms| DB[(Database)]
    DB -->|miss → 50ms| OS[(Object Storage)]
    style CDN fill:#ff6b1a,color:#0a0a0f
    style LC fill:#ffaa00,color:#0a0a0f
    style RC fill:#15803d,color:#fff
    style DB fill:#0e7490,color:#fff

Every layer is there because the next layer is slower. Browser cache → CDN → Local cache → Redis → Database → Object storage. A hit at any layer skips the rest.

The key property: a cache hit at layer N is roughly 10–100× faster than a hit at layer N+1.

What lives where

Layer	Examples	Latency	Best for
Browser	HTTP cache, IndexedDB	0ms	Static assets, user-specific data
Edge / CDN	CloudFront, Cloudflare, Fastly	5–30ms	Global static + edge logic
Reverse proxy	Varnish, Nginx	<1ms	API responses, rendered pages
App-local (in-process)	Caffeine (Java), LRU map	μs	Reference data, hot lookups
Distributed	Redis, Memcached	1ms	Shared session, query cache, fanout
Database buffer pool	Postgres `shared_buffers`	μs	Hot pages
OS page cache	Linux	μs	All file/disk I/O

You don't pick one — most systems use four or five of these in concert.

In-process caches

The fastest cache is the one inside your process. No network, no serialization — nanosecond access with no infrastructure to manage. The downside is that each app server has its own copy, so N servers means N copies. It doesn't survive restarts, and a write on one server won't invalidate the cache on another.

Use in-process caches for small reference data that's stable and safe to be slightly stale: feature flags, plan tiers, decoded JWT keys. Anything that has to be globally consistent belongs in a shared store instead.

Distributed caches (Redis, Memcached)

A distributed cache — Redis or Memcached — is a single shared store visible to all app instances. Because it's network-attached, you pay about 1ms per operation, and it becomes a dependency that can fail. But a single write invalidates the value for every server at once, and the cache can grow larger than any individual instance can hold.

This makes Redis the right choice for session stores, query caches, leaderboards, rate limiters that need to be globally enforced, and fanout data like social timelines.

Cache write strategies

When the data changes, what does the cache do? There are five patterns, each with a different correctness/latency profile.

Cache-aside (lazy loading) — the most common in practice

The application is responsible for both reading and writing the cache.

sequenceDiagram
    participant App
    participant Cache
    participant DB
    App->>Cache: get(k)
    alt cache hit
        Cache-->>App: v
    else cache miss
        Cache-->>App: null
        App->>DB: select where id = k
        DB-->>App: v
        App->>Cache: set(k, v, ttl)
    end

For writes:

def write(key, value):
    db.write(key, value)
    cache.delete(key)   # or set(key, value); both are common

The read is lazy — the cache only fills after the first miss. The first reader after a write sees stale or nothing; subsequent readers hit cache. This is simple, robust, and it's the default choice for most systems. The main thing to watch for is the race window between a write and the cache delete.

Write-through

sequenceDiagram
    participant App
    participant Cache
    participant DB
    App->>Cache: write(k, v)
    Cache->>DB: write(k, v)
    DB-->>Cache: ok
    Cache-->>App: ok

Write goes to both stores synchronously before acknowledging the caller. The cache is always fresh — no stale read after a write. You pay for both stores on every write, which is fine when writes are infrequent or when freshness matters more than write speed.

Write-back (write-behind)

sequenceDiagram
    participant App
    participant Cache
    participant DB
    App->>Cache: write(k, v)
    Cache-->>App: ok
    Note over Cache,DB: async, batched
    Cache->>DB: write(k, v)

Write to cache and return immediately; flush to the database later in batches. Write latency drops dramatically, and batching means fewer, larger writes to the DB. The risk: if the cache node dies before flushing, that data is gone. Reserve write-back for situations where you can afford brief data loss — telemetry pipelines, analytics counters, game state.

Write-around

The app writes directly to the database and skips the cache entirely. The cache gets populated only when a subsequent read misses. This pattern protects the cache from write-heavy data that won't be re-read soon — bulk imports, log data, cold backfills. The first read after any write will always miss.

Read-through

The cache itself fetches from the database on a miss, rather than the application. You write to the cache; it handles the DB synchronization. Common with managed caches like AWS DAX in front of DynamoDB. The application code is simpler because it only talks to the cache, but the cache and DB are now tightly coupled.

Picking a strategy

Here is a quick decision tree:

flowchart TD
    Q1{Reads or writes dominant?}
    Q1 -->|Reads| Q2{Tolerate stale on miss?}
    Q1 -->|Writes| Q3{Can you tolerate data loss?}
    Q2 -->|Yes| CA[Cache-aside]
    Q2 -->|No| WT[Write-through]
    Q3 -->|Yes| WB[Write-back]
    Q3 -->|No| WT2[Write-through]
    style CA fill:#15803d,color:#fff
    style WT fill:#ff6b1a,color:#0a0a0f
    style WB fill:#ffaa00,color:#0a0a0f
    style WT2 fill:#ff6b1a,color:#0a0a0f

If you have to pick one default, cache-aside with TTL is what 80% of systems use.

Eviction policies

When the cache fills up, what gets thrown out?

Policy	What it evicts	When it shines
LRU (Least Recently Used)	Oldest-accessed	General-purpose; popular for in-process caches
LFU (Least Frequently Used)	Lowest hit count	Long-tail traffic with stable popularity
FIFO	Oldest-inserted	Simple; rarely the right choice
TTL	Anything past expiry	Time-bounded freshness
Random	Random	Surprisingly hard to beat at high pressure
TinyLFU + LRU (Caffeine)	Hybrid: LFU admission, LRU eviction	Better hit rate than either alone

Redis ships with noeviction as the default — it returns errors when full, which is almost never what you want. You almost always want allkeys-lru or volatile-lru unless you have a specific reason.

maxmemory 8gb
maxmemory-policy allkeys-lru

Redis policy	Behavior
`noeviction`	Return errors when full
`allkeys-lru`	Evict any key by LRU
`volatile-lru`	Evict only keys with TTL by LRU
`allkeys-lfu`	Evict any key by LFU
`volatile-lfu`	Evict only keys with TTL by LFU
`allkeys-random`	Evict a random key
`volatile-ttl`	Evict the key closest to expiry

A subtle point about LRU at scale

True LRU requires keeping a sorted list of every key — expensive at scale. Redis approximates LRU by sampling N random keys and evicting the oldest among them (default N=5; tunable via maxmemory-samples). Since Redis 3.0 the algorithm also maintains a small pool of eviction candidates across sampling rounds, which makes it significantly closer to true LRU than pure random sampling. That's why "Redis LRU" is fast but not strictly LRU. In practice, the approximation is fine.

TTLs (time-to-live)

Every cache entry should have a TTL — even if "infinite is fine." TTLs serve as:

Belt-and-suspenders freshness: even if your invalidation logic has a bug, the stale entry expires.
Memory bound: cold keys quietly disappear instead of accumulating.
A safety valve: a runaway cache problem self-heals after the TTL.

Don't synchronize TTLs. If you set ttl=3600 on every key at the same time, in 1 hour they all expire simultaneously and your DB melts. Always jitter: ttl = base_ttl + random(0, base_ttl × 0.1).

The 80/20 rule

flowchart LR
    A[Top 20% of keys] -->|serve 80% of requests| B[Cache it]
    C[Bottom 80% of keys] -->|serve 20% of requests| D[Skip the cache]
    style B fill:#15803d,color:#fff
    style D fill:#6b6b85,color:#0a0a0f

Caches work because access patterns are skewed (Pareto / Zipf distribution). The "head" of the long tail makes up most of your traffic and fits in cache. The "tail" rarely repeats and is wasted memory.

This is why your cache size estimate is 0.2 × daily_read_bytes, not 1.0 × daily_read_bytes.

Sizing a cache

The math:

1. Compute daily read volume in bytes.
2. Multiply by hot ratio (typically 20% for Zipf-like distributions).
3. Add 20–30% for overhead (Redis metadata, memory fragmentation).
4. Divide by per-instance RAM (e.g. 32 GB) to get cluster size.

Example: 10 TB/day read volume × 20% hot × 1.3 overhead ≈ 2.6 TB cache → 80 instances of 32 GB.

Cache hit ratio is the metric to track. A well-sized cache hits 90%+; below 80% and you're either undersized or thrashing (TTLs too short, evictions too aggressive).

Cache problems and fixes

Four classic pathologies. If you've never run into them, you will.

1. Stale data

The cache says X, the database says Y. This happens when a write lands in the database but the old value is still sitting in the cache — your invalidation logic missed a case, or there isn't any.

The standard toolkit: set TTLs so stale entries eventually expire on their own; call cache.delete(k) explicitly after every write; subscribe to a database change feed (CDC via Debezium) and drive invalidations from there; or use versioned keys (user:42:v3) so bumping the version number makes every stale cached value effectively invisible — no delete needed.

For most apps, TTLs plus explicit delete-on-write is enough. CDC-driven invalidation is the next step when "stale read" is causing real bugs.

2. Thundering herd / cache stampede

A hot key expires. Thousands of concurrent requests miss simultaneously. They all hit the database. The database falls over.

sequenceDiagram
    participant R1 as Request 1
    participant R2 as "Request 2...10k"
    participant C as Cache
    participant DB as Database
    R1->>C: get(hot_key)
    C-->>R1: null (expired)
    R2->>C: get(hot_key)
    C-->>R2: null (expired)
    Note over R1,R2: All 10k requests proceed
    R1->>DB: SELECT ...
    R2->>DB: SELECT ...
    Note over DB: 10k simultaneous queries — DB melts
    style DB fill:#ff2e88,color:#fff

The cleanest fix is single-flight / request coalescing: only one request rebuilds the cache; every other concurrent request waits on that result instead of firing its own query. Alternatively, use a distributed lock (SET NX EX) around the rebuild so only one process wins the race. You can also use probabilistic early expiration — a random fraction of requests refresh slightly before the TTL — or maintain two TTLs: a soft one that triggers a background refresh while still serving stale, and a hard one that forces a fresh fetch.

3. Cache penetration

Requests for keys that don't exist — an attacker probing for /user/-1, for instance. Every request misses the cache, hits the database, gets nothing back, and skips the cache again. The DB takes a sustained beating from requests that will always be misses.

Three ways to stop it: cache the negative (null with a short TTL), so the second identical request is served from cache; put a Bloom filter in front of the cache so requests for definitively nonexistent keys never even reach it; or validate input before touching the cache or DB and reject obviously bad IDs at the edge.

4. Cache avalanche

A large number of keys expire at the same wall-clock moment — because you set them all to expire at midnight, for instance. At midnight the DB gets slammed.

Jitter fixes this almost entirely: ttl = base_ttl + random(0, 60s). If your cache supports it, also configure graceful degradation so the system serves stale values on cache failure rather than passing every miss through to the DB simultaneously.

Cache warming

A cold cache is dangerous. After a restart or a new deploy, every request misses, the DB melts, and the system may never recover because the load keeps the DB too busy to respond fast enough to warm the cache.

flowchart LR
    START[Cache restart] --> COLD[Cold cache]
    COLD --> MISS[All requests miss]
    MISS --> DB[(Database overloaded)]
    DB --> SLOW[Slow responses]
    SLOW --> MORE[More concurrent requests pile up]
    MORE --> DB
    style COLD fill:#ff2e88,color:#fff
    style DB fill:#ff6b1a,color:#0a0a0f

Strategies to prevent this: pre-load the most popular keys at startup with a background job that runs the top-N most-frequent queries; replay yesterday's traffic as "shadow" traffic against the new cache before promoting it; do a gradual cutover (1% of traffic → 10% → 50% → 100%) so the cache fills as you ramp; or run blue/green deploys where you warm the green cache before switching over.

For Redis specifically: Redis can preload from RDB/AOF on startup — if persistence is enabled, the cache comes back with most of its data after a restart.

Redis data structures (the killer feature)

Memcached is a hash map of string → string. Redis is a server-side data structure store. The data structures unlock entirely new patterns.

Strings (with INCR, EXPIRE, SETNX)

SET   counter 0
INCR  counter           # atomic
EXPIRE counter 60       # auto-expire
SET   lock:job1 1 NX EX 30   # acquire lock, 30s TTL

Used for: counters, distributed locks, rate limit token buckets, atomic flags.

Hashes (per-field reads)

HSET  user:42 name "ada" email "ada@example.com" age 30
HGET  user:42 name      # just one field, no full-doc fetch
HGETALL user:42

Memory-efficient for objects with many small fields. Each field can be read/written independently.

Lists (queues, recent items)

LPUSH  inbox:42 "msg1"
LPUSH  inbox:42 "msg2"
LRANGE inbox:42 0 9     # last 10 messages
LTRIM  inbox:42 0 99    # cap to 100

Used for: bounded inbox/feed, simple queues. For real queueing, prefer Redis Streams.

Sets (membership, intersection)

SADD  liked:tweet:7 user:42 user:99
SISMEMBER liked:tweet:7 user:42        # O(1) check
SINTER friends:42 friends:99           # mutual friends

Used for: tags, follower sets, deduplication.

Sorted sets (leaderboards, time-ordered)

ZADD  leaderboard 5000 "ada" 4500 "bob" 6000 "carol"
ZREVRANGE leaderboard 0 9              # top 10
ZRANK leaderboard "ada"                # rank of a user

# Time-ordered timeline:
ZADD  timeline:42 1736284800 "tweet:1" 1736284900 "tweet:2"
ZRANGEBYSCORE timeline:42 1736284800 +inf LIMIT 0 100

Possibly the most useful Redis structure. Used for: leaderboards, ranked feeds, time-windowed sliding rate limiters, priority queues.

Streams (Kafka-lite)

XADD  events * action login user_id 42
XREAD COUNT 10 STREAMS events 0
XGROUP CREATE events workergrp $          # consumer groups

Append-only log with consumer groups. Smaller scale than Kafka, much simpler to operate.

HyperLogLog (cardinality estimation)

PFADD  uniques:home user:42
PFADD  uniques:home user:99
PFCOUNT uniques:home               # ~2; ~1% error rate

Counts distinct items in constant memory (~12 KB) regardless of how many items. Used for: distinct visitor counts, cardinality of huge sets.

Bitmaps (bit-level)

SETBIT  active:2026-01-26 42 1     # user 42 was active today
GETBIT  active:2026-01-26 42
BITCOUNT active:2026-01-26          # how many users active
BITOP AND active:weekly active:2026-01-20 active:2026-01-21 active:2026-01-22 active:2026-01-23 active:2026-01-24 active:2026-01-25 active:2026-01-26    # intersection

Tiny memory footprint for huge user populations. Used for: presence tracking, A/B test cohorts, daily/weekly active flags.

Geospatial

GEOADD  drivers 73.98 40.75 "driver:42"
GEOSEARCH drivers FROMLONLAT 73.97 40.76 BYRADIUS 1 km ASC COUNT 10

Used for: ride-hailing nearby search, "find friends near me." See the Yelp-style proximity article.

CDNs (the cache at the edge)

A CDN is a globally-distributed cache that sits between the user and your origin. It serves static (and increasingly dynamic) content from the edge — a server geographically near the user.

flowchart LR
    U1[User in Tokyo] --> E1[Edge: Tokyo]
    U2[User in São Paulo] --> E2[Edge: São Paulo]
    U3[User in Berlin] --> E3[Edge: Berlin]
    E1 -.miss.-> O[Origin: Virginia]
    E2 -.miss.-> O
    E3 -.miss.-> O
    style E1 fill:#ff6b1a,color:#0a0a0f
    style E2 fill:#ff6b1a,color:#0a0a0f
    style E3 fill:#ff6b1a,color:#0a0a0f
    style O fill:#0e7490,color:#fff

What you cache at the edge:

Images, video, JS, CSS, fonts (every static asset).
Public API responses (with Cache-Control: public).
Sometimes server-rendered HTML (Vercel ISR, Next.js).

Cache-Control headers

The CDN respects what you tell it:

Cache-Control: public, max-age=31536000, immutable
   ↑ everyone may cache, for 1 year, content never changes
   (use this for hashed asset filenames)

Cache-Control: public, max-age=300, s-maxage=3600, stale-while-revalidate=86400
   ↑ browsers cache 5min, CDN caches 1hr,
     and may serve stale up to 1 day while refreshing in background

Cache-Control: private, max-age=0, no-cache
   ↑ user-specific; never share, validate on each request

Cache-Control: no-store
   ↑ don't even store. Use for credit card responses, auth tokens.

Vary: Cookie makes a cache key include the cookie value — important for sites where a logged-in user sees different HTML.

Cache hit ratio is the metric

A well-tuned CDN serves 95%+ of bytes from the edge. Below 80% means your Cache-Control is too restrictive, your URLs aren't cache-friendly, or your origin is sending different responses to "the same" request.

Watch out for query string parameters: /page?utm_source=foo and /page are different cache keys by default. Strip tracking params at the edge before caching.

Browser HTTP cache

The smallest, fastest cache — and the one users love because it makes "back" instant.

Cache-Control: public, max-age=86400
ETag: "a8c7..."

The browser stores the response. On revisit:

Within max-age: serve from local cache, no request at all.
Past max-age: send If-None-Match: "a8c7..." to validate. Server returns 304 Not Modified if unchanged.

The biggest perf win for any site: make your hashed assets immutable, max-age=31536000 — /assets/main.a8c7.js, /img/logo.b3f0.png. The hash in the filename means the URL changes when content changes, so "cache forever" is safe.

When NOT to cache

The data changes more often than it's read.
The data is so cheap to compute that the cache doesn't help.
You can't tolerate any staleness (rare in practice — usually you can).
The data is per-user and the user only sees it once (no reuse).
Strong-consistency-required reads (financial balances at the moment of a transaction).

Cache coherence (the hard part)

A single cache is easy. Many caches consistent with one DB is hard. The general patterns:

flowchart TD
    DB[(Source of truth)] --> CDC[Change feed CDC]
    CDC --> C1[Cache 1 invalidates]
    CDC --> C2[Cache 2 invalidates]
    CDC --> C3[Cache 3 invalidates]
    style DB fill:#ff6b1a,color:#0a0a0f
    style CDC fill:#15803d,color:#fff

Single source publishes invalidations to all caches (via Kafka, Redis pub/sub, or DB CDC like Debezium).
Versioned keys so a stale cache hit at most returns "old version, key is v3" → consumer rejects, refetches.
TTLs as a backstop against missed invalidations.

For most apps, TTLs plus explicit invalidation on write are sufficient. CDC-driven invalidation is the next step when stale reads become a production issue.

Worked example: caching layers for an article reader

For a system like this site:

Layer	What	TTL	Why
Browser HTTP cache	Static assets (JS, CSS)	1 year, immutable	Hashed filenames
CDN	HTML pages (with stale-while-revalidate)	1 hour, swr 1 day	Articles change rarely
Server in-process	Article metadata	5 min	Page generation
Redis	Search index	1 hour	Fuzzy search results
Postgres buffer pool	Article rows	(managed)	Always-hot pages

Most readers never touch Postgres at all. Most readers don't touch our origin. Most readers don't even fetch from the CDN — they hit their browser cache.

Things you should now be able to answer

What's the difference between write-through and write-back caching?
A hot key expires under load. What is the failure pattern and how do you avoid it?
Why is allkeys-lru usually the right Redis eviction policy?
Why does a CDN improve perceived latency more than a faster origin?
When is a cache the wrong solution?
A user updates their profile but reloads and sees the old version. What's wrong, and what are three ways to fix it?
You restart your Redis cluster and the database falls over from cold-cache traffic. What should have happened instead?
Sorted sets vs sets in Redis — when do you reach for which?

→ Next: Load balancing & proxies

// FAQ

Frequently asked questions

▸What is cache-aside (lazy loading) and why is it the most common cache write strategy?

In cache-aside, the application checks the cache first and populates it only on a miss, then writes to the database directly and deletes or overwrites the cache entry on updates. It is the most common strategy because it is simple and robust — the cache only holds data that has actually been requested — and it is what 80% of systems use, typically paired with a TTL.

▸What is a thundering herd (cache stampede) and how do you prevent it?

A thundering herd happens when a hot key expires and thousands of concurrent requests simultaneously miss the cache, all hitting the database at once and potentially taking it down. The cleanest fix is single-flight or request coalescing, where only one request rebuilds the cache and every other concurrent request waits on that result. Alternatives include a distributed lock using Redis SET NX EX, probabilistic early expiration, or a soft TTL that triggers a background refresh while still serving stale data.

▸What is the difference between write-through and write-back caching?

Write-through writes to both the cache and the database synchronously before acknowledging the caller, so the cache is always fresh but every write pays the cost of two stores. Write-back writes to the cache and returns immediately, then flushes to the database asynchronously in batches, dramatically lowering write latency at the cost of potential data loss if the cache node dies before the flush completes.

▸Why is allkeys-lru the recommended Redis eviction policy, and what does Redis ship with by default?

Redis ships with noeviction as the default, which simply returns errors when the cache is full — almost never what you want in production. Allkeys-lru evicts the least-recently-used key across the entire keyspace, which aligns with the reality that a small fraction of keys absorb most traffic. Volatile-lru is the alternative when you only want to evict keys that have a TTL set.

▸How do you size a distributed cache, and what hit ratio signals a well-tuned cache?

Start with daily read volume in bytes, multiply by the hot data ratio (typically 20% for Zipf-like distributions), add 20-30% for Redis metadata and memory fragmentation overhead, then divide by per-instance RAM to get cluster size. For example, 10 TB per day times 20% times 1.3 overhead is roughly 2.6 TB, or about 80 instances of 32 GB. A well-sized cache sustains a hit ratio of 90% or higher; below 80% signals the cache is undersized, TTLs are too short, or evictions are too aggressive.

← previous module

Storage Systems — Files, Blocks, and Objects

next module →

Load Balancers, Proxies, and Service Mesh