// TOPIC

#search

12 articles

Design a GraphRAG System (Knowledge-Graph-Augmented Retrieval)

When vanilla vector RAG fails on "summarize the entire corpus" and multi-hop questions, you build a knowledge graph first. Covers entity extraction, Leiden community detection, map-reduce global search, and graph traversal for multi-hop queries, based on Microsoft GraphRAG and production deployments at Neo4j, LinkedIn, and Writer.

#interview #ai #rag

27 min

◆◆◆ Advanced · Uber, Airbnb

Design a Feature Store

Serve the exact same feature values to model training and online inference — eliminating training-serving skew — across batch, streaming, and on-demand tiers at sub-10ms latency and millions of reads per second. The architecture powering Uber Michelangelo, Airbnb Chronon, and DoorDash Gigascale.

#interview #ai #mlops

26 min

◆◆◆ Advanced · Google, AWS

Design an Intelligent Document Processing Pipeline

Turn millions of messy PDFs, scans, and invoices into validated structured JSON at scale — the end-to-end pipeline covering OCR, layout analysis, LLM-based field extraction, confidence-scored routing, human-in-the-loop review, and the cost math that determines build-vs-buy.

#interview #ai #llm

28 min

◆◆◆ Advanced · GitHub, Cursor

Design an AI Coding Assistant (Copilot / Cursor)

Architect a system that delivers inline ghost-text completions in under 200ms and drives an autonomous agent that edits dozens of files — the two-product architecture behind GitHub Copilot, Cursor, and Sourcegraph Cody at billions of completions per day.

#interview #ai #llm

31 min

◆◆◆ Advanced · Pinecone, Google

Design a Vector Database / Semantic Search Service

Index 1 billion 768-dimensional vectors and answer top-k similarity queries in under 20 ms — the ANN indexing, sharding, and filtering architecture behind Pinecone, Weaviate, and pgvector.

#interview #ai #vector-db

23 min

◆◆◆ Advanced · OpenAI, Google

Design a RAG (Retrieval-Augmented Generation) Pipeline

Ground an LLM in 10 million documents (50 million chunks) with sub-2-second answers and a hallucination rate measurable by automated eval — the end-to-end ingestion, retrieval, reranking, and generation pipeline powering enterprise knowledge assistants.

#interview #ai #rag

28 min

◆◆◆ Advanced · Elastic, Splunk

Design a Centralized Log Aggregation System (ELK / Splunk)

Collect, store, and search logs from thousands of services. Collection agents, a buffered ingestion pipeline, time-based inverted indexes, hot-warm-cold storage tiers, and cost control.

#interview #observability #search

25 min

◆◆◆ Advanced · Elastic, Amazon

Design a Distributed Search Engine (Elasticsearch)

Index billions of documents and answer full-text queries in milliseconds. Inverted indexes, sharding and replication, scatter-gather, and relevance scoring.

#interview #search #indexing

21 min

◆◆◆ Advanced · Google, Microsoft

Design an Email Service (Gmail)

Send, receive, store, and search email for hundreds of millions of users. SMTP ingestion, sharded mailbox storage, full-text search, and spam filtering.

#interview #storage #search

23 min

◆◆◆ Advanced · Airbnb, Booking

Design a Hotel / Airbnb Booking System

Search available listings and book date ranges without double-booking. Availability as a range problem, reservation holds, and the search vs. transaction split.

#interview #inventory #consistency

21 min

◆◆ Intermediate · Google, Yelp

Design Yelp / Nearby Search (proximity service)

Find restaurants and businesses near a location, fast. Geohash, quadtree, hexagonal cells, and the right index for "within 5 km of me".

#interview #geo #search

16 min

◆◆ Intermediate · Google, Amazon

Design Search Autocomplete (Typeahead)

Sub-100 ms autocomplete suggestions across billions of queries: tries, top-k caching, and personalized ranking.

#interview #search #trie

15 min