Design Google Maps (routing & navigation)
Find the fastest route across a continent-scale road graph in milliseconds, with live traffic and ETAs. Contraction hierarchies, live traffic weights, and tile serving.
The problem
Google Maps, Apple Maps, and Waze answer a deceptively simple question: what is the fastest way to drive from point A to point B right now? Open the app on a Tuesday morning in Los Angeles and in under a second you get a turn-by-turn route that accounts for a fender-bender on the 405, a lane closure near downtown, and the surge of school-run traffic — across a road network spanning hundreds of millions of edges.
The scale is staggering. Roughly one billion route requests hit Google's infrastructure every day. The underlying road graph for North America and Europe combined holds around 100 million nodes (intersections) and 250 million directed edges (road segments). Running Dijkstra cold on a graph that size for a cross-continent query would take several seconds per request — orders of magnitude over the 100 ms latency target. So the first engineering tension is query speed vs. graph size: the graph is too large for any online shortest-path algorithm, which means all the heavy lifting must happen in a preprocessing phase.
The second tension is freshness vs. stability. The preprocessed graph is built offline on a slow schedule; live traffic data flows in at 20 million events per second from phones actively navigating. Those two realities have to be reconciled so that the precomputed structure stays valid while traffic weights change continuously beneath it. Solve both tensions and you get a system that returns an accurate, traffic-aware route across a continent in under 100 ms.
The entire design is built around one insight: precompute as much as possible so that the online query does as little work as possible. Everything else — contraction hierarchies, the telemetry pipeline, the tile CDN — follows from that.
Functional requirements
GET /route?origin=...&destination=...&mode=driving→ turn-by-turn directions + ETA.GET /eta?origin=...&destination=...→ ETA in seconds (lighter, faster).- Real-time re-routing when the user deviates from the suggested path.
- Map tile rendering at configurable zoom levels.
- (Optional) Geocoding: search by address or place name → lat/lng.
- (Optional) Multiple route alternatives ranked by time/distance.
- (Optional) Route modes: driving, walking, cycling, transit.
Non-functional requirements
- Low latency: p99 route query < 100 ms; tile load < 200 ms.
- High availability: 99.99% — navigation failure in a moving car is dangerous.
- Graph freshness: road changes (new roads, closures) reflected within hours.
- Traffic freshness: segment weights updated within 30–60 seconds.
- Scale: tens of thousands of route queries per second globally.
Capacity estimation
| Dimension | Estimate | How we got there |
|---|---|---|
| Route requests (avg) | ~11,600 req/sec | 1 B/day ÷ 86,400 s |
| Route requests (peak) | ~35,000 req/sec | 3× average |
| Node data | ~10 GB | ~100 M nodes × ~100 B each |
| Edge data | ~12.5 GB | ~250 M edges × ~50 B each |
| CH shortcuts | ~12–25 GB | Augmentation factor ~2–3× on road networks (1–2× original edge count) |
| Graph total per region shard | ~35–48 GB | Node + edge + shortcuts → fits in RAM on a large server |
| Traffic telemetry events | ~20 M events/sec | 100 M active sessions × 1 ping/5 s |
| Traffic raw ingestion | ~4 GB/sec | 20 M events/sec × ~200 B each |
| Traffic aggregated state | ~2 GB | ~250 M segments × 8 B speed |
| Map tiles (zoom 0–14) | ~358 M tiles | (4^15 − 1) / 3 ≈ 358 M |
| Map tiles (zoom 15–20) | ~10 B tiles total (sparser land coverage) | Much lower density outside populated areas |
| Hot tiles (zoom 10–15, populated) | ~tens of millions | Fits in CDN edge cache with a warm-up period |
Takeaway: The graph for a continent fits comfortably in the RAM of a large server (64–128 GB RAM is sufficient; 256 GB servers are standard in this tier), which is why the routing service keeps the entire graph in memory.
The road graph data model
The fundamental abstraction is a weighted directed graph:
- Node: an intersection or road endpoint, stored with (lat, lng) and a unique integer ID.
- Edge: a directed road segment from node u to node v, with a weight representing expected travel time in seconds (not distance — we optimize for time). Edge metadata includes: speed limit, road class (highway / arterial / local), number of lanes, turn restrictions, one-way flag.
G = (V, E, w)
V = {intersection IDs}
E = {(u, v) : there is a directed road segment u → v}
w(u, v) = travel_time_seconds(u → v) [dynamic, updated with traffic]
Road data is sourced from a combination of official government datasets, commercial providers, and community-edited sources such as OpenStreetMap. The graph is rebuilt (re-preprocessed) on a schedule — typically nightly for major structural changes — and traffic weights are patched continuously without re-preprocessing.
Why plain Dijkstra is too slow
Dijkstra's algorithm on a graph with 250 M edges runs in O(E log V) time. A continental-scale query (SF → NYC, ~4 500 km) would need to relax tens of millions of edges before the destination is settled. Even with a fast priority queue, this takes several seconds — far too slow.
A* with a geographic heuristic helps somewhat. Using Euclidean or great-circle distance as the heuristic steers the search toward the destination, but on real road networks the heuristic is loose (roads don't go in straight lines), so A* still explores a large fraction of the graph on long queries.
The solution is preprocessing.
Contraction hierarchies (CH): the key algorithmic insight
Contraction hierarchies is a preprocessing technique that enables sub-millisecond to low-millisecond shortest-path queries on continental road graphs. It was introduced by Geisberger et al. in 2008 (building on earlier highway hierarchies work by Sanders and Schultes, 2005) and has become the dominant algorithm for road routing in production systems.
Preprocessing phase
The idea is to iteratively remove ("contract") nodes in order of importance, adding shortcut edges to preserve shortest-path distances. Think of it like collapsing a road network: local residential streets get contracted first, then arterials, until only the highway backbone remains.
Concretely:
- Assign an importance score to every node. Low-importance nodes are contracted first. Importance heuristics typically combine: edge difference (how many new shortcut edges contracting this node adds), number of settled neighbors, and road class of incident edges.
- To contract node u: for each pair of neighbors (v, w) where the shortest path from v to w goes through u, add a shortcut edge (v, w) with weight
w(v, u) + w(u, w). The shortcut means we can skip u entirely. - After contraction, assign each node a level (or rank) equal to the order in which it was contracted. Uncontracted nodes have higher rank; the highway-level backbone survives as the highest-rank nodes.
- The result: a hierarchical graph where long-distance queries only touch high-rank nodes (the highway skeleton), and short local queries only touch low-rank nodes.
flowchart TD
subgraph "Before contraction"
A1[A] -- 3 --> B1[B]
B1 -- 5 --> C1[C]
end
subgraph "After contracting B"
A2[A] -- "8 (shortcut A→C)" --> C2[C]
end
style A1 fill:#0e7490,color:#fff
style B1 fill:#ff6b1a,color:#0a0a0f
style C1 fill:#0e7490,color:#fff
style A2 fill:#0e7490,color:#fff
style C2 fill:#0e7490,color:#fff
Query phase
Once preprocessed, a query runs bidirectional search: one search expands upward through the hierarchy from the source; another expands upward from the destination. The two searches meet at or near the top of the hierarchy — the highway layer that both searches reach quickly.
forward_search(source): only relax edges going to higher-rank nodes
backward_search(dest): only relax edges going to higher-rank nodes (in reverse)
answer = min over all meeting nodes m: d_fwd(m) + d_bwd(m)
The following diagram shows how both searches "climb" toward the high-rank highway layer and meet in the middle, instead of fanning out across the entire graph:
flowchart LR
SRC[Source node<br/>rank: low] -->|"forward: climb ranks"| H1[Highway A<br/>rank: high]
DST[Dest node<br/>rank: low] -->|"backward: climb ranks"| H1
H1 -->|"meeting point → unpack path"| PATH[Full route polyline]
SRC --> H2[Highway B<br/>rank: high]
DST --> H2
H2 --> PATH
style SRC fill:#0e7490,color:#fff
style DST fill:#0e7490,color:#fff
style H1 fill:#ff6b1a,color:#0a0a0f
style H2 fill:#ff6b1a,color:#0a0a0f
style PATH fill:#15803d,color:#fff
Because both searches only move "up" the hierarchy, they settle quickly — typically a few hundred nodes per direction even for a cross-continent query, versus tens of millions for plain Dijkstra. Academic benchmarks on European road graphs (~18 M nodes) report query times well under 1 ms (often 100–500 µs on the hardware of the era, down to tens of microseconds on modern hardware); on larger continental graphs (~100 M nodes) queries typically resolve in the low-millisecond range or better in practice.
The precomputed shortcuts encode the implicit intermediate path; when the user wants turn-by-turn directions, the routing engine unpacks the shortcut chain to recover the full sequence of road segments.
Live traffic: dynamic edge weights
The CH preprocessing assumes fixed edge weights. Traffic breaks this assumption. The solution is a careful separation: the static CH shortcuts are computed on base weights (free-flow or historical average travel time) and rarely change. Traffic correction factors are computed per segment and applied on top at query time.
Scale factor approach
For each road segment (u, v), maintain a current speed ratio:
current_weight(u, v) = base_weight(u, v) × (base_speed / current_speed)
When the live speed on segment (u, v) is half the free-flow speed, its weight doubles. This factor is applied when relaxing edges during the bidirectional query.
The CH shortcuts must be invalidated or reweighted when the underlying segment weights change significantly. In practice, shortcuts are recomputed periodically (e.g., every few hours) for the most heavily affected parts of the graph, or a traffic-aware variant of CH (sometimes called time-dependent CH) is used where shortcuts store weight functions over time instead of scalars.
Traffic telemetry pipeline
Every phone running navigation is a sensor. Each device emits a GPS ping every few seconds containing timestamp, lat/lng, speed, heading, and trip ID. That's 100 M active sessions pinging every ~5 seconds — about 20 M events per second flowing into the system.
flowchart LR
PHONES[Navigation apps<br/>on phones] -->|GPS pings| INGEST[Ingest Service]
INGEST --> KAFKA[Kafka topic<br/>raw pings]
KAFKA --> AGG[Streaming Aggregator<br/>Flink / Spark Streaming]
AGG -->|"speed per segment, per 30s"| TRAF[(Traffic State Store<br/>Redis / in-memory)]
TRAF --> ROUTER[Routing Service]
style PHONES fill:#0e7490,color:#fff
style KAFKA fill:#a855f7,color:#fff
style AGG fill:#ff6b1a,color:#0a0a0f
style TRAF fill:#15803d,color:#fff
The ingest service map-matches each raw GPS coordinate to the nearest road segment (using a hidden Markov model or similar), then emits a (segment_id, observed_speed) event to Kafka. The streaming aggregator consumes these events and maintains a rolling window (e.g., last 3–5 minutes) of observed speeds per segment, outputting a per-segment speed estimate every 30–60 seconds to the traffic state store.
The routing service reads these weights from the traffic cache on every query — so every route computed reflects traffic from the last ~30–60 seconds.
Scale check: 20 M telemetry events/sec × ~200 B ≈ 4 GB/s ingested (at the 1 ping/5 s rate; 2 GB/s at 1 ping/10 s). This is well within the capacity of a Kafka cluster with adequate partition count (~200–500 partitions). The aggregated output is ~250 M segments × 8 B = ~2 GB of state, easily kept in a Redis cluster or in-process memory on the routing servers.
ETA prediction
A naive ETA is just the sum of current edge weights along the route. Production systems do better by layering multiple signals into an ML model:
| Signal | What it captures |
|---|---|
| Current edge weights (traffic) | Real-time congestion |
| Historical speed by hour + day | Predictable patterns (rush hour, weekend) |
| Incident data (accidents, closures) | Step-function slowdowns |
| Weather | Rain/snow increase travel times |
| ML model per segment | Non-linear interactions between the above |
A trained model (gradient-boosted trees or a neural network) takes a feature vector per segment — current speed ratio, hour of day, day of week, historical p50/p90 for that slot, recent incident flag — and outputs a predicted travel time. These per-segment predictions are summed to produce the route ETA.
The ML model is retrained offline (e.g., daily) on historical observations of "predicted ETA vs. actual arrival time." During navigation, ETA is recalculated at each re-routing check (every ~30 seconds or whenever the user deviates).
Building up to the design
V1: A* on the live graph, single server
Run A* with a great-circle heuristic. Works for a city. Continental routes take 5–15 seconds. Every query touches RAM-resident edge data; no preprocessing.
The problem is latency, 10–100× too slow for long routes, and a single server is a SPOF.
V2: Partition the graph by region
Divide the continent into cells (e.g., 100 km × 100 km). Precompute shortest paths between cell boundary nodes. A query routes within the source cell, crosses between cells via precomputed boundary paths, then routes within the destination cell. Inter-cell queries are much faster — but path quality degrades near cell boundaries, and partitioning artifacts produce suboptimal routes that are hard to maintain as the graph changes.
V3: Contraction hierarchies, static weights
Run full CH preprocessing on the entire graph. Queries drop to sub-millisecond to low single-digit ms. Preprocess nightly; reload without downtime using a blue/green deploy of the routing servers.
The remaining gap: routes ignore traffic. ETA is inaccurate during rush hour or incidents.
V4: CH + live traffic weights
Add the telemetry pipeline. Patch traffic weights on top of CH shortcuts at query time. ETA becomes accurate; re-routing happens in milliseconds. What's still missing is the full product — tile serving and geocoding.
V5: Production maps system
Add tile serving (pre-rendered, CDN), geocoding service, and the full ML ETA model. This is the design described in the rest of this article.
flowchart LR
V1["V1: A* on live graph<br/>city-scale, seconds"] --> V2["V2: + region partitioning<br/>faster but lower quality"]
V2 --> V3["V3: + CH preprocessing<br/>sub-ms to low-ms queries"]
V3 --> V4["V4: + live traffic<br/>accurate ETA"]
V4 --> V5["V5: + tiles + geocoding<br/>full maps product"]
style V1 fill:#0e7490,color:#fff
style V3 fill:#15803d,color:#fff
style V4 fill:#ff6b1a,color:#0a0a0f
style V5 fill:#a855f7,color:#fff
Full architecture
flowchart TD
CLIENT[Mobile / Web Client] --> CDN[CDN edge]
CLIENT --> GW[API Gateway / Load Balancer]
GW --> GEO_SVC[Geocoding Service]
GW --> ROUTE_SVC[Routing Service]
GW --> ETA_SVC[ETA Service]
GW --> NAV_SVC[Navigation / Re-route Service]
ROUTE_SVC --> CH_GRAPH[(CH Graph<br/>in-process RAM)]
ROUTE_SVC --> TRAF_CACHE[(Traffic Cache<br/>Redis)]
ROUTE_SVC --> ML_ETA[ML ETA Model]
ETA_SVC --> TRAF_CACHE
ETA_SVC --> ML_ETA
NAV_SVC --> ROUTE_SVC
CDN -->|tile miss| TILE_SVC[Tile Service]
TILE_SVC --> OBJ_STORE[(Object Store<br/>S3 / GCS)]
PHONES[GPS pings from phones] --> INGEST_SVC[Ingest Service]
INGEST_SVC --> KAFKA[Kafka]
KAFKA --> AGG[Traffic Aggregator<br/>Flink]
AGG --> TRAF_CACHE
GEO_SVC --> GEO_IDX[(Geocoding Index<br/>Elasticsearch / custom)]
style CDN fill:#ff6b1a,color:#0a0a0f
style ROUTE_SVC fill:#15803d,color:#fff
style CH_GRAPH fill:#0e7490,color:#fff
style KAFKA fill:#a855f7,color:#fff
style TRAF_CACHE fill:#ffaa00,color:#0a0a0f
style ML_ETA fill:#ff2e88,color:#fff
The routing service in detail
Graph storage
The CH graph is kept entirely in process memory on the routing server. For a continental graph (~35–48 GB after preprocessing), routing servers need ~64–128 GB RAM — standard hardware for server-class machines today.
The routing process holds two edge arrays:
- Forward edges (for the forward search from source).
- Backward edges (for the backward search from destination).
Both are stored as adjacency arrays sorted by source node ID, so looking up all outgoing edges of node u is a simple range scan.
A route request, step by step
sequenceDiagram
participant Client
participant GW as API Gateway
participant Geo as Geocoding Service
participant Router as Routing Service
participant TC as Traffic Cache
participant ML as ML ETA Model
Client->>GW: GET /route?origin="123 Main St"&dest="JFK Airport"
GW->>Geo: resolve origin address → (lat, lng)
GW->>Geo: resolve dest address → (lat, lng)
Geo-->>GW: origin_node_id, dest_node_id
GW->>Router: route(origin_node, dest_node)
Router->>TC: fetch current speeds for relevant segments
TC-->>Router: speed ratios
Router->>Router: bidirectional CH query (sub-ms to ~5 ms)
Router->>ML: predict ETA for each segment on path
ML-->>Router: ETA per segment
Router-->>GW: route polyline + total ETA
GW-->>Client: JSON response
Notice that the geocoding step resolves addresses to graph node IDs before the routing query even starts — geocoding and routing are cleanly separated. The total latency budget:
| Step | Typical ms |
|---|---|
| Geocoding (cached) | 5–20 |
| Traffic cache read | 2–5 |
| Bidirectional CH query | 0.1–5 |
| ETA model inference | 5–20 |
| Serialize + network | 5–10 |
| Total p50 | ~25–60 ms |
Re-routing
When a driver deviates from the suggested route, the navigation service detects the deviation (current position no longer matches the expected segment) and triggers a new route query from the current position. Because CH queries are so fast, re-routing is imperceptible to the user.
Re-routing also happens proactively: the navigation service periodically checks whether a different route would now be significantly faster given current traffic, and presents the alternative if so.
Map tile serving
The visible map is composed of tiles — small square image (raster) or geometry (vector) chunks served independently and stitched by the client.
Zoom levels and quadtrees
The world is divided using a quadtree scheme: at zoom level 0, the entire world is one tile (256×256 pixels). Each zoom increment divides each tile into 4 sub-tiles. At zoom level z, there are 4^z tiles total.
Zoom 0: 1 tile (world overview)
Zoom 10: ~1 M tiles (city level)
Zoom 14: ~268 M tiles (street level)
Zoom 18: ~68 B tiles (building level)
At zoom 18, a 256×256-pixel tile covers roughly 153 m × 153 m on the ground at the equator (0.6 m/pixel resolution) — enough to distinguish individual buildings and lanes. In practice, tiles beyond zoom 16 are only rendered on demand over populated areas.
Tile coordinates are expressed as (zoom, x, y) — the column and row within the grid at that zoom level. This triple uniquely identifies every tile in the world. The quadtree structure means zooming in on a tile simply queries the four children at the next level:
flowchart TD
Z0["Zoom 0: world (1 tile)"] --> Z1A["Zoom 1: NW"]
Z0 --> Z1B["Zoom 1: NE"]
Z0 --> Z1C["Zoom 1: SW"]
Z0 --> Z1D["Zoom 1: SE"]
Z1A --> Z2A["Zoom 2: NW·NW"]
Z1A --> Z2B["Zoom 2: NW·NE"]
Z1A --> Z2C["Zoom 2: NW·SW"]
Z1A --> Z2D["Zoom 2: NW·SE"]
style Z0 fill:#ff6b1a,color:#0a0a0f
style Z1A fill:#0e7490,color:#fff
style Z1B fill:#0e7490,color:#fff
style Z1C fill:#0e7490,color:#fff
style Z1D fill:#0e7490,color:#fff
style Z2A fill:#15803d,color:#fff
style Z2B fill:#15803d,color:#fff
style Z2C fill:#15803d,color:#fff
style Z2D fill:#15803d,color:#fff
Raster vs. vector tiles
| Property | Raster tiles (PNG/JPEG) | Vector tiles (PBF/MVT) |
|---|---|---|
| Content | Pre-rendered pixels | Raw geometry + labels |
| Tile size | ~20–100 KB per tile | ~5–50 KB per tile |
| Client rendering | Trivial (draw image) | Requires a WebGL/Metal renderer |
| Style changes | Requires re-rendering all tiles | Client re-styles without new tiles |
| Retina / scaling | Must regenerate at 2× resolution | Client scales geometry losslessly |
| Bandwidth | Higher | Lower |
Modern mapping clients (web and mobile) increasingly use vector tiles because they allow smooth zoom animations, night/day mode, and custom styles without re-downloading tiles. The server stores pre-generated vector tiles in object storage.
Tile generation and caching
Tiles are pre-generated offline from the road and map geometry data:
- Map data pipeline: raw OSM or commercial data → geometry normalization → feature extraction (roads, buildings, water, parks).
- Tile renderer: for each zoom level, render (or serialize geometry into) each tile covering populated areas. Store to object storage.
- CDN: all tile GETs go through the CDN. A tile not in CDN cache is fetched from the tile service, which reads from object storage and responds. The CDN caches the tile with a long TTL (days to weeks — tiles change only when the underlying data changes).
The vast majority of tile reads are CDN hits. Tile cache-miss rate is low because tiles at the popular zoom levels over populated areas are warm in the CDN. Updating a tile (because a new road was built) requires invalidating the CDN cache entry for that tile's (z, x, y) key.
Geocoding: the hidden complexity
Geocoding is a separate service that sits upstream of routing — routing only understands node IDs, not street names.
- Forward geocoding: "1600 Pennsylvania Ave NW, Washington DC" → (38.8977, -77.0366).
- Reverse geocoding: (37.7749, -122.4194) → "San Francisco, CA".
The geocoding service maintains an inverted index over address tokens and place names. Queries are fuzzy (typos, abbreviations, partial names). In practice, systems like this use a combination of an Elasticsearch or custom inverted index for text search, plus a ranking model that scores candidates by query match quality, population, and proximity to the user.
Geocoding resolves to a node ID in the road graph, which is then passed to the routing service.
Failure modes
Stale traffic data
If the Kafka pipeline or aggregator falls behind, the routing service may use outdated speed estimates. The fix is layered: set a TTL on each segment's traffic entry in the cache (e.g., 5 minutes) and fall back to the historical average for that hour if it expires. Monitor Kafka consumer lag; alert if lag exceeds 2 minutes. The fallback degrades ETA quality but keeps routing functional.
Graph update propagation
When a road closes (accident, construction), the update must reach all routing servers quickly. An operations team or automated incident feed publishes a "segment closure" event to Kafka; routing servers consume this event and mark the edge weight as effectively infinite (blocked) at query time — no full re-preprocessing needed. When the closure is lifted, the override is removed. For permanent changes (new roads), a full CH re-preprocessing is scheduled and the new graph binary is deployed with a blue/green swap.
Very long routes
Routes spanning multiple continents (e.g., through ferry connections or international roads) may touch graph regions stored on separate region shards. A routing coordinator decomposes the query into legs (one per region shard), executes each leg's CH query in parallel, and stitches the results.
Region failover
Routing servers are replicated across availability zones. The CH graph is read-only at query time (all writes are offline preprocessing), so any replica can serve any query. The load balancer health-checks each routing server and routes around failures.
Storage choices
| Data | Store | Rationale |
|---|---|---|
| CH graph (static) | Mmap'd binary file, in-process RAM | Fast random access; read-only after load |
| Traffic weights (live) | Redis cluster | Sub-millisecond reads; ~2 GB total state |
| Traffic raw events | Kafka | Durable stream; replay on aggregator restart |
| Map tiles | Object storage (S3/GCS) + CDN | Immutable after generation; CDN absorbs reads |
| Geocoding index | Elasticsearch / custom inverted index | Fuzzy text search; horizontal scaling |
| Road geometry (raw) | Postgres / custom spatial DB | Source of truth for preprocessing pipeline |
| ML ETA model weights | Object storage, loaded into routing server RAM | Model updated daily; versioned deploys |
Connection to ride-hailing
The ETA service described here is exactly what the Uber/Lyft dispatch system needs to rank driver candidates. The difference is scale and freshness: ride-hailing needs ETAs for thousands of (driver, rider) pairs per second in a single metro area; Google Maps needs route computation globally but can tolerate the route being computed once and cached for the navigation session.
In both cases, the routing backbone is a contraction hierarchy on a road graph with live traffic weights. Ride-hailing systems often run a lighter-weight, less precise routing engine (direct-line ETA estimates with traffic correction factors) for the candidate-ranking phase, reserving the full CH query for the confirmed route.
Things to discuss in an interview
- Why CH over plain Dijkstra/A*: preprocessing reduces query work from O(E log V) on a 250 M-edge graph to settling a few hundred nodes per direction. The preprocessing-vs-query trade-off is the crux.
- Traffic weight updates without full re-preprocessing: scale factors patched at query time; periodic partial re-preprocessing for heavily affected regions.
- ETA quality: naive sum of current edge weights underestimates congestion propagation; ML models trained on historical arrival times outperform static estimates.
- Tile caching economics: pre-render offline, serve via CDN; only popular tiles ever leave the CDN. Invalidation on road changes is the hard part.
- Re-routing latency: because CH queries are so fast (~ms), re-routing is imperceptible; no need for incremental re-routing algorithms.
- Graph sharding: split by geography; a routing coordinator stitches cross-region queries.
Things you should now be able to answer
- Why does Dijkstra fail at continental scale, and what does contraction hierarchies do differently?
- How are traffic weights incorporated into a preprocessed graph without full re-preprocessing?
- What is a map tile, and why are tiles served from CDN rather than generated on demand?
- How does the routing service handle road closures in near-real-time?
- Why is ETA from an ML model better than simply summing current edge weights?
- How does geocoding connect to the routing graph?
Further reading
- Geisberger et al., "Contraction Hierarchies: Faster and Simpler Hierarchical Routing in Road Networks" (WEA 2008) — the foundational CH paper.
- Sanders and Schultes, "Highway Hierarchies Hasten Exact Shortest Path Queries" (ESA 2005) — the predecessor technique that CH improves upon.
- Bast et al., "Route Planning in Transportation Networks" (2016) — comprehensive survey of highway hierarchies, CH, and related techniques.
- OpenStreetMap wiki on routing — overview of how OSM data is used in production routing.
- "Engineering Fast Route Planning" — various engineering blog posts from routing infrastructure teams at mapping companies.
- Design Uber / Lyft — how the ETA service plugs into a dispatch system.
Frequently asked questions
▸What is contraction hierarchies and why does Google Maps use it instead of Dijkstra?
Contraction hierarchies is a preprocessing technique introduced by Geisberger et al. in 2008 that iteratively removes low-importance nodes from the road graph and adds shortcut edges, leaving only the highway-level backbone. A bidirectional query then climbs upward through this hierarchy from both source and destination, settling a few hundred nodes per direction instead of tens of millions. On a 250-million-edge continental graph, plain Dijkstra takes several seconds per query; CH resolves the same query in sub-millisecond to low-millisecond time.
▸How does live traffic get incorporated without re-running the full CH preprocessing?
Each road segment carries a current speed ratio computed from phone GPS pings, and the routing service multiplies the base edge weight by that ratio at query time: current_weight = base_weight x (base_speed / current_speed). The CH shortcuts themselves are only periodically re-preprocessed for heavily affected regions; for permanent closures, the segment weight is set to effectively infinite via a Kafka event consumed by routing servers, with no full rebuild required.
▸What is the traffic telemetry scale, and how is it processed?
100 million active navigation sessions each emit a GPS ping every 5 seconds, producing roughly 20 million events per second and about 4 GB per second of raw ingestion. An ingest service map-matches each ping to a road segment and publishes it to Kafka; a streaming aggregator (Flink or Spark Streaming) maintains a rolling window of observed speeds and outputs a per-segment estimate every 30 to 60 seconds to a Redis traffic state store holding roughly 2 GB of state across 250 million segments.
▸Why is an ML model used for ETA instead of just summing current edge weights?
Summing current edge weights underestimates congestion propagation and misses predictable patterns. The ML model (gradient-boosted trees or a neural network) takes per-segment features including current speed ratio, hour of day, day of week, historical p50/p90 for that time slot, and recent incident flags, then outputs a predicted travel time per segment. The model is retrained daily on historical predicted-vs-actual arrival observations and recalculated every 30 seconds or on deviation during active navigation.
▸How many map tiles exist at zoom level 14, and how does the CDN handle serving them?
At zoom level 14 there are approximately 268 million tiles total, of which roughly 78 million cover the Earth's land area. Tiles are pre-rendered offline from road and geometry data and stored in object storage; all GET requests go through a CDN with a TTL of days to weeks. The CDN absorbs over 95 percent of tile reads, and a tile miss falls through to the tile service backed by object storage.
You may also like
Design an LLM Observability Platform
Build the distributed tracing backbone for non-deterministic, multi-step LLM applications — capturing every prompt, completion, token count, and dollar cost across chains, retrievals, and tool calls so you can debug a failed agent run and account for every cent.
Design an LLM Gateway (AI Gateway & Model Router)
A single proxy control plane in front of OpenAI, Anthropic, Google, and open models — routing ~65 trillion tokens a month with automatic failover, semantic caching, per-team budget enforcement, and streaming SSE passthrough, all under 50 ms of added latency.
Design an LLM Fine-Tuning Platform
Turn a base model and a dataset into a deployed fine-tuned adapter at scale — the end-to-end platform covering dataset ingestion, LoRA/QLoRA/DPO training, fault-tolerant distributed GPU scheduling, eval gating, and multi-LoRA serving for hundreds of concurrent fine-tunes.