Design Uber / Lyft (ride hailing)
Match drivers to riders in real time at city scale. Geohashing, dispatch algorithms, surge pricing, and the realtime location pipeline.
The problem
Uber launched in San Francisco in 2010 with a dead-simple premise: tap a button and a car arrives. By 2024, the platform was handling roughly 28–31 million trips a day across 70+ countries, with around 161 million monthly active users. Lyft runs the same model in North America. At small scale this is a weekend project — a database table of drivers, a query that finds the nearest one, done. At real scale it is one of the harder distributed systems problems in consumer software.
The core challenge is matching under movement. Millions of drivers are continuously moving, each emitting a GPS ping every few seconds. A ride request arrives and the system must find the right driver — not just the closest one, but the one who will actually arrive soonest given current traffic — within a latency budget riders don't consciously notice. Then it must claim that driver atomically so two simultaneous requests don't both get assigned to the same car. And it must do all of this while ingesting roughly 400k–500k location updates per second.
That write throughput is what kills the obvious approaches. A single relational database saturates. A naive full-table scan of driver coordinates is O(N) on every request. The solution is a spatial index in memory — a geohash or hexagonal grid — that turns "find drivers near this point" from a scan into a handful of key lookups. The rest of the architecture fans out from that core insight: location in Redis, dispatch as a dedicated service with an atomic claim step, surge pricing as a streaming control loop, and a Kafka-backed trip lifecycle.
The two tensions that run through the whole design are write throughput versus query speed (you have to keep the geo index fresh without making every read wait for a write) and atomic coordination at sub-second latency (claim one driver across potentially hundreds of competing requests without a slow distributed lock). Every major design decision in this article is an answer to one of those two problems.
Functional requirements
- Riders open the app, see nearby drivers, request a ride.
- Drivers receive ride requests, accept/decline.
- Once matched, both see live location of the other.
- Trip ends, fare calculated, payment processed.
- Surge pricing in busy areas.
Non-functional
- p99 dispatch < 500ms.
- 99.99% availability — ride hailing failures = stranded customers.
- ~161M MAU globally (Q3 2024); ~28–31M rides/day.
- Realtime location updates from millions of moving drivers.
Capacity
| Dimension | Estimate | How we got there |
|---|---|---|
| Monthly active users | ~161M globally | Uber Q3 2024: 161M MAPCs, 13% YoY growth |
| Active drivers (global, incl. couriers) | ~5–6M total; ~2M online at peak | Uber published figures; peak assumes ~35–40% of fleet online simultaneously |
| Location update rate (peak) | ~450k writes/sec | 2M drivers ÷ 4.5s avg ping interval |
| Rides per day | ~28–31M/day | Q1 2024 ~28M/day, Q3 2024 ~31M/day |
| Dispatch rate — average | ~360 rides/sec | 31M ÷ 86,400s |
| Dispatch rate — peak | ~1,500 rides/sec | ~4× average; evening rush in dense markets |
| Dispatch latency budget | must beat ~10–30s patience | Rider patience ~10s; driver offer window ~15–30s |
Takeaway: the location-write throughput (~450k/sec) is the dimensioning constraint — it rules out any single relational database and mandates an in-memory geo index.
Building up to the design
Uber is the most geometric question in the catalog. The hard part isn't the API — it's "given a moving cloud of drivers, find one quickly and don't double-book." The naive version is doable in a hackathon weekend; production-Uber takes years. Walking the path makes each layer earn its keep.
V1: One table, brute-force scan
CREATE TABLE drivers (id INT, lat FLOAT, lng FLOAT, online BOOL);
Driver app pings UPDATE drivers SET lat=?, lng=? WHERE id=? every few seconds.
On a ride request:
SELECT id FROM drivers WHERE online
ORDER BY (lat-?)^2 + (lng-?)^2 LIMIT 5;
This works at neighborhood scale — every problem reduces to SQL and you have a demo in a weekend. But that ORDER BY is an O(N) full sort on every dispatch. At 100k drivers online, every ride request scans the entire table. And once you're doing 450k location updates per second, a single Postgres write-ahead log becomes the bottleneck long before the query plan does.
V2: Geohash index
Encode (lat, lng) into a string prefix where shared prefixes mean spatial closeness. Index on the geohash.
nearby_drivers(lat, lng) → all drivers with same 6-char geohash prefix → ~1.2km × 0.6km cell
Candidate lookup drops from O(N) to a single index range scan — a huge win. But cities aren't uniform. A 6-char geohash in midtown Manhattan might hold 1,000 drivers; one in rural Wyoming has zero. More importantly, writing every location update to Postgres still saturates the disk. A geo index tells the database where to look, not how fast to write.
V3: Move location state to Redis
Store live driver positions in Redis. Either GEOADD (Redis's native geo index, supporting GEORADIUS queries in ~1ms) or sharded hash maps keyed by geohash bucket. Persist to the durable DB only on meaningful state transitions — came online, went offline, started a trip.
400k+ location writes per second land comfortably in Redis. Proximity queries drop under 1ms. Postgres is no longer in the hot path at all. What's left: dispatch is still naive. "Pick the closest 5, try driver #1, time out, try #2" is slow and produces bad UX. Worse, two riders requesting at the same instant can both receive an offer to the same driver.
V4: Dispatch service with atomic claim
A dedicated Dispatch Service does four things in sequence:
- Query the geo index for candidate drivers within the relevant cells.
- Rank them by predicted ETA — not raw distance, because traffic matters.
- Atomically claim the top candidate with
SET driver:X:offered ride:Y NX EX <ttl>— a single Redis command that sets the key only if it doesn't exist and expires it automatically. Exactly one caller wins; the other immediately moves to its next candidate. - Send the offer. On accept, transition to trip. On decline or timeout, retry down the ranked list.
This eliminates double-booking and keeps dispatch latency under 500ms for the non-driver-wait portion. What it can't fix is when rider demand simply exceeds driver supply — and that's when drivers learn there's money sitting in the app.
V5: Surge + supply/demand balance
A streaming pipeline (Kafka feeding an aggregator) measures the demand/supply ratio per geo cell every 30–60 seconds. When a cell goes imbalanced, a surge multiplier raises prices — showing riders "1.4× surge" and showing nearby off-duty drivers a "hot zone" indicator. In practice, ML models predict imbalance before it fully develops and apply surge proactively rather than reactively.
V6: Production Uber
V3 + V4 + V5 plus a trip lifecycle service, payments, a full ETA service with map data and live traffic, push/WebSocket notification channels, and Kafka-fed analytics across all of it.
flowchart LR
V1[V1: SQL scan<br/>100 drivers] --> V2[V2: + geohash<br/>fast proximity]
V2 --> V3[V3: + Redis live state<br/>400k+ writes/sec]
V3 --> V4[V4: + dispatch service<br/>no double-book]
V4 --> V5[V5: + surge pricing<br/>balance]
V5 --> V6[V6: + trip + payments + ETA]
style V1 fill:#0e7490,color:#fff
style V3 fill:#15803d,color:#fff
style V4 fill:#ff6b1a,color:#0a0a0f
style V6 fill:#a855f7,color:#fff
The rest of the article zooms in on V3–V6 — what it actually takes to run this.
High-level architecture
flowchart TD
DRV[Driver app] -.location pings.-> LP[Location Pipeline]
RID[Rider app] -->|"request ride"| API[API Gateway]
API --> DISPATCH[Dispatch Service]
LP --> GEODB[(Geo Index<br/>H3 cells / Redis)]
DISPATCH --> GEODB
DISPATCH --> ETA[ETA Service]
DISPATCH --> SURGE[Surge Service]
DISPATCH --> DRV2[Driver app<br/>via Push / WebSocket]
DRV2 --> ACCEPT[Driver accepts]
ACCEPT --> TRIP[Trip Service]
TRIP --> RID2[Rider notified]
TRIP --> PAY[Payments]
TRIP --> KAFKA[Kafka events]
KAFKA --> ANA[Analytics]
style DISPATCH fill:#ff6b1a,color:#0a0a0f
style GEODB fill:#15803d,color:#fff
style LP fill:#0e7490,color:#fff
Storing locations efficiently
With ~2M drivers online at peak, each pinging every 4–5 seconds, you're looking at ~450k writes per second. That write rate will saturate any single relational database — and a naive ORDER BY distance on raw lat/lng still requires a full scan even with a btree index on the coordinate columns. You need a spatial index.
Geohash
Encode a (lat, lng) into a string where common prefixes mean geographic proximity.
(37.775, -122.418) → "9q8yyk"
(37.776, -122.419) → "9q8yyk" ← same prefix
(34.052, -118.243) → "9q5ctr" ← different prefix (LA vs SF)
Each character of the geohash adds precision by halving the bounding box in alternating longitude/latitude. A 6-character geohash covers a rectangular cell roughly 1.2 km wide × 0.6 km tall — a reasonable first-pass radius for "show me drivers near here." For denser cities, query the cell plus its 8 neighbors to avoid edge effects when a driver is near a cell boundary.
flowchart TD
G[Earth: lng -180..180, lat -90..90] --> L[First bit: lng < 0 or ≥ 0]
L --> LL[West half]
L --> LR[East half]
LR --> LR1[Second bit: lat < 0 or ≥ 0]
LR1 --> LR2[Each base-32 character adds 5 bits<br/>→ shrinks bounding box area ~32× per char]
style LR2 fill:#ff6b1a,color:#0a0a0f
Hexagonal grid: Uber's H3
Uber actually uses H3, a hexagonal grid library they open-sourced. Hexagons tile the sphere with uniform neighbor distance — every hexagon's center is equidistant from all 6 neighboring hexagon centers, a property squares don't have. H3 has 16 resolution levels (0–15), ranging from continent-size cells (~4M km²) down to ~1 m².
For ride dispatch, resolution 9 is a common choice: each cell is ~0.1 km² with an average edge length of ~174m. That's roughly 2–3 city blocks — small enough to be sparse (a few drivers per cell in most areas) but large enough that "query this cell + 6 neighbors" always captures candidates within ~475m. At finer resolutions you'd query more neighbors; at coarser resolutions the candidate list gets too large to rank cheaply.
Each driver's last location is stored in (h3_cell_at_resolution_9, driver_id). Lookup "drivers in this cell + 6 neighbors" fetches 7 Redis keys — each a sorted set — and merges the results.
The location pipeline
Drivers don't write directly to Redis. Their location pings go through a pipeline that handles batching, deduplication, and the TTL writes.
flowchart LR
APP[Driver app<br/>4-5s ping] --> INGEST[Location Ingest<br/>service]
INGEST --> KAFKA[(Kafka<br/>location topic)]
KAFKA --> WORKER[Location Worker<br/>fan-out]
WORKER --> GEO[(Geo Index<br/>Redis sorted sets)]
WORKER --> HIST[(Location History<br/>Cassandra)]
GEO -.TTL 30s.-> GEO
style INGEST fill:#0e7490,color:#fff
style KAFKA fill:#a855f7,color:#fff
style GEO fill:#15803d,color:#fff
style HIST fill:#ffaa00,color:#0a0a0f
Location events land in Kafka first. Workers consume them and write to two places: the Redis geo index (current position, with a 30-second cell-level TTL so inactive cells automatically age out) and a Cassandra table for historical location data (used for trip replay, compliance, and driver earnings disputes). The Kafka buffer also smooths out spikes — if there's a momentary surge in driver pings, the workers drain the queue rather than hammering Redis directly.
Where do we store this?
A Redis sorted set per cell, keyed by cell_id, with a cell-level TTL (the EXPIRE applies to the entire key, so all drivers in a cell expire together — stale cells age out automatically):
ZADD geo:cell:8a283082 <ts> driver_42
EXPIRE geo:cell:8a283082 30
# Note: ZADD + EXPIRE are two separate commands — a crash between them leaves
# the key without a TTL (same race as SETNX+EXPIRE). In production, wrap both
# in a Lua script (EVAL) or use a pipeline with error handling.
Or specialized: Uber historically used Ringpop (now deprecated) — an in-memory consistent-hash ring library that sharded the geo index across a cluster of nodes, with each node owning a set of cells. The pattern is sound regardless of the specific library: shard cell ownership across a cluster, keep the index in memory, rely on a membership protocol (SWIM / gossip) for node discovery.
Dispatch algorithm
sequenceDiagram
participant Rider
participant Dispatch
participant Geo
participant Driver
Rider->>Dispatch: request ride at (lat, lng)
Dispatch->>Geo: drivers within 5 km?
Geo-->>Dispatch: list of ~20 candidates
Dispatch->>Dispatch: filter (vehicle type, rating, ETA)
Dispatch->>Dispatch: rank by ETA + rider/driver score
loop for top candidates
Dispatch->>Driver: ping (accept window ~15s)
alt accept
Driver-->>Dispatch: ✓
Dispatch-->>Rider: matched!
else decline / timeout
Driver-->>Dispatch: ✗
end
end
Real systems don't just hand the request to the closest driver. They batch several recent ride requests together with nearby drivers and solve a small bipartite assignment problem: riders on one side, drivers on the other, edge weights equal to predicted ETA. The Hungarian algorithm (O(n³)) finds the minimum-weight perfect matching. Uber accumulates requests in ~1–2 second windows, so the graph is small — tens of riders × tens of drivers per cell — and the solve finishes well under 100ms. The payoff is that collective wait time across all riders drops versus greedy nearest-first dispatch.
Two other adjustments matter: drivers who decline too often are penalized with fewer offers (so gaming the system by waiting for shorter trips is costly), and dispatch ranks by ETA rather than euclidean distance — a driver across a bridge is closer in kilometers but farther in minutes. ETA comes from the routing engine described below.
ETA service
For both dispatch ("which driver gets there fastest?") and rider UX ("ETA is 4 minutes"), you need realtime traffic-aware routing.
flowchart LR
REQ[ETA request] --> ROUTER[Routing engine]
ROUTER --> GRAPH[(Road graph<br/>OSM-derived)]
ROUTER --> TRAFFIC[(Traffic data<br/>realtime)]
ROUTER --> ML[ML predictor<br/>this segment historically takes X]
ROUTER --> ETA[ETA in seconds]
style ROUTER fill:#ff6b1a,color:#0a0a0f
style ML fill:#a855f7,color:#fff
Backbone: contraction hierarchies (CH) on a road graph. CH preprocesses the graph by iteratively "contracting" low-importance nodes and adding shortcut edges, creating a hierarchy (local roads → arterials → highways). A bidirectional Dijkstra search over this hierarchy visits only a few hundred nodes for a cross-city query, giving sub-millisecond exact shortest paths versus seconds for vanilla Dijkstra. ML is layered on top to learn "this segment at 5pm on a Friday normally takes 30% longer than the static weight."
Surge pricing
Surge is dynamic pricing in response to local imbalance — too many requests, too few drivers. The goals are to encourage more drivers to log on and to push rate-sensitive riders toward waiting or walking. A simple algorithm:
demand = ride requests in last 3 min in cell C
supply = available drivers in cell C
imbalance = demand / supply
multiplier = clip(0.7 + 3.0 × log(imbalance), 1.0, 4.0)
Calculated per H3 cell, recomputed every 30–60 seconds. The key insight is that it's local imbalance that drives price — a shortage in downtown has no bearing on the airport, and vice versa. In practice, ML models predict imbalance ahead of time and apply surge proactively, so the multiplier doesn't lag badly behind a sudden concert let-out.
flowchart LR
EVENTS[Location + ride events] --> KAFKA[(Kafka stream)]
KAFKA --> AGG[Surge aggregator<br/>30-60s windows]
AGG --> CELL{Imbalance by<br/>H3 cell}
CELL -->|demand > supply| MULT[Compute multiplier<br/>clip 1.0 - 4.0]
MULT --> CACHE[(Surge cache<br/>Redis)]
CACHE --> RIDER[Rider app<br/>shows surge]
CACHE --> DRIVER[Driver app<br/>shows hot zone]
style AGG fill:#ff6b1a,color:#0a0a0f
style CACHE fill:#15803d,color:#fff
style KAFKA fill:#a855f7,color:#fff
The trip lifecycle
stateDiagram-v2
[*] --> Requested
Requested --> Matched: driver accepts
Matched --> EnRoute: driver heading to pickup
EnRoute --> InTrip: rider in car
InTrip --> Completed: dropoff confirmed
Completed --> Paid: payment processed
Paid --> [*]
Requested --> Cancelled: rider cancels
Matched --> Cancelled: driver cancels (penalty)
EnRoute --> Cancelled
Each state transition is an event. The Trip Service is the source of truth; events go to Kafka for downstream consumers — analytics, payments, notifications. Nothing downstream holds authoritative state; they're just projections off the event stream.
Storage choices
| Data | Store |
|---|---|
| User accounts | Postgres |
| Driver locations (current) | Redis / in-memory |
| Driver locations (history) | Cassandra |
| Trips (active) | Postgres / DynamoDB |
| Trips (archive) | Cassandra / S3 |
| Geo index | Redis sorted sets / custom in-memory |
| Payments | Postgres (strong consistency, money) |
| Events / analytics | Kafka → S3 → BigQuery |
Hot path: ride request
Latency budget for "open app → driver dispatched":
| Step | ms |
|---|---|
| API gateway + auth | 30 |
| Geo lookup (drivers nearby) | 50 |
| ETA & ranking | 100 |
| Driver ping + ~15s response window | up to 15,000 |
| Dispatch confirm | 50 |
| Notify rider | 50 |
The driver acceptance window dominates everything else. All the machinery above — geo lookup, ETA ranking, atomic claim — completes in under 200ms. The Dispatch Service retries with the next candidate if a driver declines or times out, which means total wall-clock time before "no drivers available" can be 30–60 seconds in bad supply conditions. Uber has varied the acceptance window over the years (driver community reports and third-party sources cite windows in the roughly 10–20 second range, varying by market and period), trading rider wait time against driver quality of life.
Mobile considerations
The driver app is always-on, always-streaming. Battery and data are precious.
- Adaptive ping rate: 4s when moving, 10s when stationary.
- WebSocket for bidirectional communication (push ride requests, receive trip events) — or HTTP long-poll on poor networks.
- Foreground service on Android, background location on iOS, with explicit user consent.
Failure modes
Driver app loses connection mid-trip
The app caches recent state and reconciles when the connection returns. The server keeps the trip state authoritative.
Two riders match the same driver simultaneously
The atomic claim in Redis prevents this on the happy path:
SET driver:{id}:offer ride:{Y} NX EX <offer_window_seconds>
SET … NX EX is a single atomic command (available since Redis 2.6.12): it sets the key only if it does not already exist and sets its TTL in one operation. The older pattern of SETNX followed by a separate EXPIRE has a race — if the process crashes between the two calls, the key has no TTL and locks that driver slot forever. With SET NX EX, if two dispatch workers race, exactly one wins; the other gets a nil response and immediately tries its next candidate. On accept, the Dispatch Service does a final compare-and-set to transition the driver to IN_TRIP before telling the rider. No distributed lock is needed — the single Redis Cluster slot for that key serializes the SET.
sequenceDiagram
participant W1 as Dispatch Worker 1
participant W2 as Dispatch Worker 2
participant R as Redis
participant RID1 as Rider 1
participant RID2 as Rider 2
W1->>R: "SET driver:42:offer ride:A NX EX 15"
W2->>R: "SET driver:42:offer ride:B NX EX 15"
R-->>W1: OK (wins)
R-->>W2: nil (loses)
W1->>RID1: Driver 42 is on the way
W2->>R: "SET driver:99:offer ride:B NX EX 15"
W2->>RID2: Driver 99 is on the way
A region's data center fails
Cross-region replication. Failover routes new requests to a healthy region. Active trips in the failed region may be served from replicas with eventual consistency.
Things to discuss in an interview
- Spatial index: geohash or H3 cells; in-memory index keyed by cell.
- Dispatch as an optimization problem: not just nearest, but ETA + rider score + driver decline rate.
- Surge as a control loop: demand/supply → multiplier; updated per cell every 30–60 seconds.
- Event-driven trip lifecycle: state machine with Kafka events.
- Realtime location pipeline: batched, with TTL, in-memory.
Things you should now be able to answer
- Why is geohashing better than naive lat/lng comparison?
- Why does Uber use hexagonal cells (H3) instead of square cells?
- How do you ensure a driver isn't double-matched?
- What's the difference between greedy and batched matching?
- Why is surge pricing computed per cell, not globally?
Further reading
- "Engineering Real-time Indexing at Uber" — eng.uber.com
- H3 hexagonal indexing library — open source, github.com/uber/h3
- "Marketplace 101: matching & routing" — Lyft Engineering
Frequently asked questions
▸Why does Uber use H3 hexagonal cells instead of square geohash cells for the geo index?
Hexagons tile the sphere with uniform neighbor distance — every hexagon center is equidistant from all six neighboring centers, a property squares lack. Uber uses H3 resolution 9 cells (~0.1 km², ~174m edge length), where querying a cell plus its six neighbors always captures candidates within roughly 475m with no distance distortion at cell boundaries.
▸How does Uber prevent two riders from being matched to the same driver simultaneously?
The Dispatch Service issues a Redis SET driver:{id}:offer ride:{Y} NX EX command — a single atomic operation that sets the key only if absent and applies a TTL in one step. Exactly one dispatch worker wins; the other receives nil and immediately tries its next candidate. The older SETNX followed by a separate EXPIRE has a race condition: if the process crashes between those two calls, the key has no TTL and locks the driver slot permanently.
▸What write throughput must Uber's location pipeline handle, and why does that rule out a relational database?
With roughly 2 million drivers online at peak each pinging every 4-5 seconds, the system ingests approximately 450,000 location writes per second. That rate saturates any single relational database's write-ahead log, so live driver positions are stored in Redis (in-memory), with Postgres reserved only for durable state like accounts and completed trips.
▸What is batched bipartite matching in Uber's dispatch, and how does it improve on greedy nearest-first?
Rather than assigning each request to the closest available driver immediately, the Dispatch Service accumulates ride requests over 1-2 second windows, then solves a bipartite assignment problem — riders on one side, drivers on the other, edge weights equal to predicted ETA — using the Hungarian algorithm (O(n³)). Because the per-cell graph is small (tens of riders times tens of drivers), the solve finishes well under 100ms, and the result minimizes collective wait time across all riders rather than just optimizing each request greedily.
▸How is surge pricing calculated and why is it per cell rather than global?
A streaming aggregator reads location and ride events from Kafka, computes demand/supply ratio per H3 cell every 30-60 seconds, and applies the formula: multiplier = clip(0.7 + 3.0 x log(imbalance), 1.0, 4.0). The cap of 4x is reached at an imbalance of roughly 3 (3 requests per available driver), a realistic peak scenario. It is local by design — a shortage downtown has no bearing on the airport — and ML models predict imbalance ahead of time so the multiplier does not lag behind sudden demand spikes like a concert letting out.
You may also like
Design an LLM Observability Platform
Build the distributed tracing backbone for non-deterministic, multi-step LLM applications — capturing every prompt, completion, token count, and dollar cost across chains, retrievals, and tool calls so you can debug a failed agent run and account for every cent.
Design an LLM Gateway (AI Gateway & Model Router)
A single proxy control plane in front of OpenAI, Anthropic, Google, and open models — routing ~65 trillion tokens a month with automatic failover, semantic caching, per-team budget enforcement, and streaming SSE passthrough, all under 50 ms of added latency.
Design an LLM Fine-Tuning Platform
Turn a base model and a dataset into a deployed fine-tuned adapter at scale — the end-to-end platform covering dataset ingestion, LoRA/QLoRA/DPO training, fault-tolerant distributed GPU scheduling, eval gating, and multi-LoRA serving for hundreds of concurrent fine-tunes.