#search
12 articles
Design a GraphRAG System (Knowledge-Graph-Augmented Retrieval)
When vanilla vector RAG fails on "summarize the entire corpus" and multi-hop questions, you build a knowledge graph first. Covers entity extraction, Leiden community detection, map-reduce global search, and graph traversal for multi-hop queries, based on Microsoft GraphRAG and production deployments at Neo4j, LinkedIn, and Writer.
Design a Feature Store
Serve the exact same feature values to model training and online inference — eliminating training-serving skew — across batch, streaming, and on-demand tiers at sub-10ms latency and millions of reads per second. The architecture powering Uber Michelangelo, Airbnb Chronon, and DoorDash Gigascale.
Design an Intelligent Document Processing Pipeline
Turn millions of messy PDFs, scans, and invoices into validated structured JSON at scale — the end-to-end pipeline covering OCR, layout analysis, LLM-based field extraction, confidence-scored routing, human-in-the-loop review, and the cost math that determines build-vs-buy.
Design an AI Coding Assistant (Copilot / Cursor)
Architect a system that delivers inline ghost-text completions in under 200 ms and drives an autonomous agent that edits dozens of files: the two-product architecture behind GitHub Copilot, Cursor, and Sourcegraph Cody at billions of completions per day.
Design a Vector Database / Semantic Search Service
Index 1 billion 768-dimensional vectors and answer top-k similarity queries in under 20 ms — the ANN indexing, sharding, and filtering architecture behind Pinecone, Weaviate, and pgvector.
Design a RAG (Retrieval-Augmented Generation) Pipeline
Ground an LLM in 10 million documents (50 million chunks) with sub-2-second answers and a hallucination rate measurable by automated eval — the end-to-end ingestion, retrieval, reranking, and generation pipeline powering enterprise knowledge assistants.
Design a Centralized Log Aggregation System (ELK / Splunk)
Collect, store, and search logs from thousands of services. Collection agents, a buffered ingestion pipeline, time-based inverted indices, hot-warm-cold tiers, and cost control.
Design a Distributed Search Engine (Elasticsearch)
Index billions of documents and answer full-text queries in milliseconds. Inverted indexes, sharding + replication, scatter-gather, and relevance scoring.
Design an Email Service (Gmail)
Send, receive, store, and search email for hundreds of millions of users. SMTP ingestion, sharded mailbox storage, full-text search, and spam filtering.
Design a Hotel / Airbnb Booking System
Search available listings and book date ranges without double-booking. Availability as a range problem, reservation holds, and the search vs transaction split.
Design Yelp / Nearby Search (proximity service)
Find restaurants/businesses near a location, fast. Geohash, quadtree, hexagonal cells, and the right index for "within 5 km of me".
Design Search Autocomplete (Typeahead)
Sub-100ms autocomplete suggestions across billions of queries — tries, top-k caching, and personalized ranking.