MODULE 04 / 12 · crash course

◆ Beginner

APIs and Communication Protocols

REST, gRPC, GraphQL, WebSockets, Server-Sent Events, and webhooks — when to use each, how to design them, and the patterns that keep them sane at scale.

17 min read · 2026-01-18 · Ironclad Academy

#apis #rest #grpc #graphql #networking #fundamentals

The previous module covered the transport layer — how bytes move between machines. This module covers the layer above: how programs talk to each other over those bytes. APIs are the contracts; protocols are the conventions. Get them right and you can change anything underneath without breaking callers. Get them wrong and every change becomes a coordinated multi-team migration.

The four big families

Almost every system you'll design uses one or more of these:

flowchart TD
    API[Inter-service communication] --> SYNC[Synchronous<br/>request-response]
    API --> ASYNC[Asynchronous<br/>fire-and-forget]
    API --> PERSIST[Persistent<br/>full-duplex channel]
    SYNC --> REST[REST<br/>HTTP + JSON]
    SYNC --> GRPC[gRPC<br/>HTTP/2 + Protobuf]
    SYNC --> GQL[GraphQL<br/>HTTP + flexible queries]
    ASYNC --> SSE[SSE<br/>server → client only]
    ASYNC --> WH[Webhooks<br/>server → server callbacks]
    PERSIST --> WS[WebSocket<br/>bi-directional, persistent]
    style REST fill:#ff6b1a,color:#0a0a0f
    style GRPC fill:#0e7490,color:#fff
    style GQL fill:#ff2e88,color:#fff
    style WS fill:#15803d,color:#fff

The rough rule for picking one: reach for REST when you're building a public API and want something every client can hit with curl. Lean toward gRPC for internal microservices where you control both ends and care about payload size and strict contracts. Choose GraphQL when multiple clients (web, iOS, Android) each want different slices of deeply nested data. Upgrade to WebSockets when you need the server to push updates in real time and the client also needs to send events back. Use SSE when the server just needs to push a stream one-way. And webhooks are for when someone else's server needs to notify yours — you hand them a URL, they call it.

REST in depth

REST (Representational State Transfer) is less a protocol than a set of conventions on top of HTTP. Done well, it gives you a uniform, cacheable, debug-friendly API. Done poorly, it gives you JSON-over-HTTP without the benefits.

The six REST constraints (in plain English)

Client–server — separation of concerns; clients evolve independently.
Stateless — every request carries everything the server needs. No session memory between calls.
Cacheable — responses are explicitly marked cacheable or not.
Uniform interface — resources, methods, representations are predictable.
Layered — proxies, CDNs, gateways can sit between client and origin.
Code on demand (optional, rarely used).

The big one is stateless. If your "REST" API needs server-side session state to make sense of the next call, you can't horizontally scale it without sticky sessions, and you've thrown away half the value.

Resource-oriented design

A REST API models the world as resources addressed by URLs, manipulated through a small set of HTTP methods.

GET    /orders              → list orders
POST   /orders              → create order
GET    /orders/123          → read order 123
PUT    /orders/123          → replace order 123
PATCH  /orders/123          → modify fields of order 123
DELETE /orders/123          → delete order 123

GET    /orders/123/items    → list items in order 123
POST   /orders/123/items    → add an item to order 123

Anti-pattern: POST /createOrder, POST /updateOrderStatus, POST /deleteOrder. That's RPC dressed up in HTTP — you've lost the uniform interface.

Verb-action mismatches that bite people

You want to...	Actually do
Cancel an order	`POST /orders/123/cancellations` (creates a cancellation resource) or `PATCH /orders/123 {"status":"cancelled"}`
Send a password reset email	`POST /password-resets` (creates a reset request, returns a token)
Search	`GET /orders?status=open&customer_id=42` (filters as query params)
Bulk update	`POST /orders/bulk-update` — pure REST has no good answer; this is the pragmatic exception

Status codes that pull their weight

You don't need all 60+. You need twelve:

Code	Meaning	When
200	OK	Successful read or update
201	Created	Successful POST that created a resource (return `Location: /orders/123`)
202	Accepted	Async work queued; client should poll
204	No Content	Successful DELETE or empty PUT
400	Bad Request	Malformed body, missing fields, validation failed
401	Unauthorized	No / bad credentials (yes, the name is wrong — should be "Unauthenticated")
403	Forbidden	Authenticated but not allowed
404	Not Found	Resource missing
409	Conflict	Optimistic concurrency conflict, duplicate key
429	Too Many Requests	Rate limited (include `Retry-After` header)
500	Internal Server Error	Unexpected bug — fix it
503	Service Unavailable	Overloaded or in maintenance — retry later

Pagination — the four flavors

Listing resources without pagination is a footgun. There are four ways to paginate, and the most familiar one is the one that falls over at scale:

flowchart TD
    P[Pagination strategies] --> OL[Offset/Limit<br/>?page=3&size=20]
    P --> CB[Cursor-based<br/>?cursor=eyJpZCI6MTIz]
    P --> KB[Keyset / seek<br/>?after_id=123&limit=20]
    P --> TB[Time-based<br/>?since=2026-01-01]
    style OL fill:#ff2e88,color:#fff
    style CB fill:#15803d,color:#fff
    style KB fill:#0e7490,color:#fff

Offset/limit (?page=3&size=20) is the beginner's choice. Simple to implement, easy to explain — and a trap at scale. OFFSET 1,000,000 forces the database to scan and discard a million rows on every page request. Worse, new inserts shift page boundaries while you're paginating, so rows get skipped or duplicated.

Cursor-based pagination is the fix. The server returns an opaque next_cursor token with each page; the client passes it back to get the next one. Under the hood, the cursor usually encodes the keyset value (the last id seen). There is no offset scan and no drift under inserts. The trade-off is that clients can't jump to page N, which turns out to be fine for nearly every real use case.

Keyset / seek (WHERE id > last_id ORDER BY id) is essentially cursor-based made explicit — same database behavior, just without the opaque token.

Time-based (?since=...) works naturally for event feeds but only when data is genuinely chronological and you don't need to navigate backwards.

For a public list endpoint with mutating data, cursor-based wins. Always include a next_cursor in the response and let the client pass it back.
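As a rough illustration of the cursor flow described above, here is a minimal Python sketch. The in-memory ORDERS list stands in for a database table, and the names encode_cursor, decode_cursor, and list_orders are hypothetical; a real service would run the equivalent keyset query in SQL.

```python
import base64
import json
from typing import Optional

# Hypothetical in-memory "table", already ordered by id.
ORDERS = [{"id": i, "total": i * 100} for i in range(1, 8)]


def encode_cursor(last_id: int) -> str:
    """Wrap the keyset value in an opaque, URL-safe token."""
    raw = json.dumps({"id": last_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> int:
    """Recover the last-seen id from a token produced by encode_cursor."""
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["id"]


def list_orders(cursor: Optional[str] = None, limit: int = 3) -> dict:
    last_id = decode_cursor(cursor) if cursor else 0
    # Stand-in for: WHERE id > :last_id ORDER BY id LIMIT :limit + 1
    rows = [o for o in ORDERS if o["id"] > last_id][: limit + 1]
    # Fetching one extra row tells us whether another page exists.
    page, has_more = rows[:limit], len(rows) > limit
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"data": page, "next_cursor": next_cursor}
```

A client keeps calling list_orders with the previous response's next_cursor until it comes back as None. Because the token is opaque, the server can later change what it encodes (for example a composite sort key) without breaking callers.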

Versioning without breaking the world

Three strategies:

URL versioning: /v1/orders, /v2/orders. Most common, most explicit, easiest to route at the edge.
Header versioning: Accept: application/vnd.example.v2+json. Cleaner URLs, harder to debug.
No versioning, evolve forward only: never break, only add. Stripe's approach (with date-based "API versions" as fallback). Highest discipline cost; lowest cognitive overhead for callers.

The forgotten rule: never break v1 while v2 is live. Run both for as long as a real customer is on v1.

Filtering, sorting, sparse fields

GET /orders?status=paid&customer_id=42       ← filtering
GET /orders?sort=-created_at,total           ← sorting (- = desc)
GET /orders?fields=id,total,status           ← sparse fields (only return these)
GET /orders?include=customer,items           ← side-load related resources

These let one endpoint serve many UI needs without proliferating endpoints. Pick a convention early and apply it consistently.
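One way to interpret the sort and fields parameters shown above is sketched below in Python. The helper names (parse_sort, apply_sort, select_fields) are hypothetical, and the rows are plain dicts rather than database records; in practice the parsed keys would usually be translated into an ORDER BY clause.

```python
from typing import Dict, List, Tuple


def parse_sort(sort_param: str) -> List[Tuple[str, bool]]:
    """Turn '-created_at,total' into [('created_at', True), ('total', False)].

    The boolean is True when the field should be sorted descending.
    """
    keys = []
    for part in sort_param.split(","):
        part = part.strip()
        if not part:
            continue
        descending = part.startswith("-")
        keys.append((part[1:] if descending else part, descending))
    return keys


def apply_sort(rows: List[Dict], sort_param: str) -> List[Dict]:
    """Sort by several keys; the first key in the parameter takes priority."""
    out = list(rows)
    # Python's sort is stable, so sorting by the last key first and the
    # first key last yields the expected multi-key ordering.
    for field, descending in reversed(parse_sort(sort_param)):
        out.sort(key=lambda row: row[field], reverse=descending)
    return out


def select_fields(row: Dict, fields_param: str) -> Dict:
    """Return only the fields named in a 'fields=id,total,status' parameter."""
    wanted = [f.strip() for f in fields_param.split(",") if f.strip()]
    return {f: row[f] for f in wanted if f in row}
```

A production version would also validate field names against an allow-list, so callers cannot sort on unindexed columns or request internal fields.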

Idempotency keys (the production-grade superpower)

Networks fail mid-request. The client retries. Did the first attempt go through? Without idempotency, you've now charged the customer twice.

The fix: the client generates a UUID and sends it in an Idempotency-Key header. The server stores (key, response) for ~24h. If the same key arrives again, the server returns the original response without re-executing.

POST /payments
Idempotency-Key: 7f3c2e80-4b5d-4a8e-9f2c-1234567890ab
Content-Type: application/json

{"amount": 4200, "currency": "usd", "source": "tok_..."}

Stripe popularized this; everybody copied it. It is the single most important pattern for making a write API safe to retry.
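The server side of this pattern might look roughly like the Python sketch below. It is an in-memory illustration only: IdempotencyStore and its run method are hypothetical names, and a real implementation would keep the entries in a shared store and guard against two requests with the same key running concurrently.

```python
import time


class IdempotencyStore:
    """Remembers the response for each idempotency key for a limited time."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._entries = {}         # key -> (stored_at, response)

    def run(self, key, operation):
        """Execute operation once per key; replay the stored response after that."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # retry: return the original response
        response = operation()     # first attempt (or expired key): do the work
        self._entries[key] = (now, response)
        return response
```

The handler for POST /payments would read the Idempotency-Key header and wrap the charge in store.run(key, ...), so a retried request gets the first response back instead of a second charge.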

gRPC: the internal-service workhorse

gRPC is Google's RPC framework: HTTP/2 transport, protobuf for serialization, code generation in every major language. You define your API once in a .proto file:

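syntax = "proto3";

// Order API: one service, three RPCs.
service Orders {
  rpc Get(GetOrderRequest) returns (Order);
  rpc List(ListOrdersRequest) returns (stream Order);  // server streaming
  rpc Create(CreateOrderRequest) returns (Order);
}

// The request shapes below are illustrative placeholders.
message GetOrderRequest { string id = 1; }
message ListOrdersRequest { int64 customer_id = 1; }
message CreateOrderRequest { int64 customer_id = 1; int32 total_cents = 2; }

enum Status {
  STATUS_UNSPECIFIED = 0;  // proto3 enums must start at zero
  STATUS_OPEN = 1;
  STATUS_PAID = 2;
  STATUS_CANCELLED = 3;
}

message Order {
  string id = 1;
  int64  customer_id = 2;
  int32  total_cents = 3;
  Status status = 4;
}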

The protoc compiler, together with per-language plugins, generates client and server stubs in Go, Java, Python, Rust, etc. Calls look like local function calls.

Why teams choose gRPC for internal services: Protobuf binary is typically 3–10× smaller than equivalent JSON, so payloads are smaller and parsing is faster, while HTTP/2 multiplexing lets many in-flight requests share one connection and cuts connection overhead significantly. The schema is the contract — breaking changes surface at compile time, not at 2 a.m. when a caller starts returning garbage. Streaming (client, server, or bi-directional) is a first-class feature, not a bolt-on. And deadlines and cancellation are built into the protocol, so a client's timeout propagates through the entire call chain.

The flip side: you can't debug a binary protobuf frame with curl — you need grpcurl and dedicated tooling. Browsers can't speak gRPC natively, so client-facing endpoints need a gRPC-Web proxy in front. And schema evolution requires discipline: you cannot reuse a protobuf field number, ever, or you silently corrupt data on older clients.

gRPC streaming patterns

flowchart LR
    subgraph S1[Unary]
    C1[Client] -->|1 req| S
    S -->|1 resp| C1
    end
    subgraph S2[Server streaming]
    C2[Client] -->|1 req| Sa
    Sa -->|stream| C2
    end
    subgraph S3[Client streaming]
    C3[Client] -->|stream| Sb
    Sb -->|1 resp| C3
    end
    subgraph S4[Bi-directional]
    C4[Client] <-->|stream| Sc
    end
    style Sa fill:#0e7490,color:#fff
    style Sb fill:#15803d,color:#fff
    style Sc fill:#ff6b1a,color:#0a0a0f

Server streaming is great for live tailing (logs, prices). Bi-directional is how chat services and live game servers move data.

GraphQL: the client-controlled query

GraphQL is a query language for APIs. Instead of the server defining endpoints, the client declares the shape of the response:

query {
  order(id: "123") {
    id
    total
    customer {
      name
      email
    }
    items {
      product { name price }
      quantity
    }
  }
}

The server returns exactly those fields, in that shape, in one round-trip. No more "GET /orders/123, GET /orders/123/customer, GET /orders/123/items" cascades.

What GraphQL gets right: Mobile apps love it — one request, exactly the bytes needed, no over-fetching. Multiple frontends (web/iOS/Android) can each ask for what they need without backend changes. The schema is strongly typed and introspectable; generators give you typed clients automatically.

What GraphQL gets wrong: The N+1 problem. A naive resolver for items.product runs a separate DB query per item in the list. The fix is DataLoader, which batches lookups within a single event loop tick. You will write or import this; you will not get away with not. Caching is also harder — HTTP caching is path and query-string-based, but all GraphQL traffic is POST /graphql. You need persisted queries or a GraphQL-aware cache. Authorization is per-field rather than per-endpoint, which is more flexible but means more code. And exposing a fully flexible query API to the public is a footgun — limit query depth and complexity or someone will hit you with users { posts { comments { user { posts { ... } } } } }.
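The batching idea behind DataLoader can be sketched in Python with asyncio. This is a simplified illustration of the technique, not the DataLoader library's API: BatchLoader, fetch_products, resolve_items, and the product data are all hypothetical, and QUERY_LOG exists only to make the number of backend calls visible.

```python
import asyncio


class BatchLoader:
    """Collects keys requested in one event-loop tick and fetches them together."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn   # async callable: list of keys -> {key: value}
        self._pending = {}          # key -> Future awaiting its value
        self._task = None           # the scheduled flush, if any

    def load(self, key):
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        if self._task is None:
            # The task only starts once the current code yields to the loop,
            # so every load() made before then joins the same batch.
            self._task = loop.create_task(self._dispatch())
        return self._pending[key]

    async def _dispatch(self):
        pending, self._pending, self._task = self._pending, {}, None
        try:
            values = await self._batch_fn(list(pending))
        except Exception as exc:
            for fut in pending.values():
                fut.set_exception(exc)
            return
        for key, fut in pending.items():
            fut.set_result(values.get(key))


QUERY_LOG = []                                    # one entry per backend query
PRODUCTS = {1: "Widget", 2: "Gadget", 3: "Gizmo"}


async def fetch_products(ids):
    """Stand-in for a single 'WHERE id IN (...)' database query."""
    QUERY_LOG.append(sorted(ids))
    return {i: PRODUCTS[i] for i in ids}


async def resolve_items(product_ids):
    """Resolve the product for every item in an order with one batched lookup."""
    loader = BatchLoader(fetch_products)
    return await asyncio.gather(*(loader.load(pid) for pid in product_ids))
```

Running asyncio.run(resolve_items([1, 2, 3])) resolves three products while fetch_products is called once; a naive resolver would have issued three separate queries.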

Use GraphQL when you have multiple clients with diverging needs and one team owning the schema. Avoid it for simple, single-client CRUD — REST is less rope to hang yourself with.

REST vs gRPC vs GraphQL — the side-by-side

Aspect	REST	gRPC	GraphQL
Transport	HTTP/1.1 or HTTP/2	HTTP/2 (required)	HTTP/1.1, HTTP/2
Payload	JSON, XML	Protobuf (binary)	JSON
Schema	OpenAPI (optional)	`.proto` (required)	SDL (required)
Client codegen	Optional	Built-in	Built-in
Browser-friendly	Yes	No (needs gRPC-Web)	Yes
Streaming	SSE, WS bolt-ons	Native	Subscriptions (over WS)
Caching	HTTP caching just works	DIY	Hard
Debug with curl	Yes	No	Sort of
Best for	Public APIs	Internal microservices	Multi-client apps with deep data

Real-time: WebSockets vs SSE vs long polling

When the server needs to push to the client, you have three options — and the right one is often simpler than you'd expect.

Long polling (1990s technology, still works)

Client opens a request; server holds it open until there's something to send (or timeout); client immediately reopens it.

sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: GET /events?since=42
    Note over S: holds open<br/>up to 30s
    S-->>C: {"event": "new message"}
    C->>S: GET /events?since=43
    Note over S: holds open
    S-->>C: timeout / empty
    C->>S: GET /events?since=43

Long polling works through every proxy, firewall, and ancient client because it's plain HTTP with a long timeout. The cost is reconnect latency: each new batch of events requires a fresh round-trip, and a large number of held-open requests can strain server resources.

Server-Sent Events (SSE)

A long-lived HTTP response that streams text/event-stream. One-way server → client.

GET /events HTTP/1.1
Accept: text/event-stream

HTTP/1.1 200 OK
Content-Type: text/event-stream

event: message
data: {"id": 42, "text": "hi"}

event: ping
data: {}

SSE is dead simple: a native browser API (new EventSource('/events')), automatic reconnect with Last-Event-ID, and no extra libraries needed. The constraint is directionality: it's server to client only. If the client also needs to send events, add a separate POST endpoint or step up to WebSocket. One practical footnote: over HTTP/1.1, browsers cap concurrent connections at six per origin, and that budget is shared across all open tabs. If you have three tabs each holding two SSE connections to the same origin, you've hit the ceiling and every other request to that origin from every tab queues. This is marked "Won't fix" in Chrome and Firefox; the answer is HTTP/2, which multiplexes streams over a single connection.
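On the server, producing the text/event-stream format is mostly string formatting. The Python helper below is a small sketch (format_sse is a hypothetical name): it writes the optional id and event fields, then one data line per line of payload, and ends the event with a blank line.

```python
def format_sse(data, event=None, event_id=None):
    """Serialize one Server-Sent Event frame as text."""
    lines = []
    if event_id is not None:
        # The browser echoes the last id it saw in Last-Event-ID on reconnect.
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # A payload containing newlines must be split across several data: lines.
    for chunk in str(data).split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"
```

A streaming handler would write each returned string to the open response and flush it immediately, so the client sees the event without waiting for a buffer to fill.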

WebSockets

A full-duplex TCP connection that starts as HTTP and "upgrades" to WS:

GET /chat HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

After the upgrade, both sides exchange framed messages until either closes. Bi-directional, low per-message overhead, near-zero latency once connected. The cost: the connection is stateful, which pins it to one server and complicates load balancing (see Load Balancers for sticky sessions). Proxies can behave oddly with long-lived WebSocket connections. You also own reconnection and resubscribe logic — the protocol doesn't hand you that.
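During the handshake, the server proves it understood the upgrade request by returning a Sec-WebSocket-Accept header derived from the client's Sec-WebSocket-Key. RFC 6455 defines the derivation as SHA-1 over the key concatenated with a fixed GUID, then base64. The Python sketch below shows that step only; websocket_accept is a hypothetical helper name, and a real server would normally leave this to its WebSocket library.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455; it is the same for every connection.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's handshake key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

For the sample key in the handshake above, dGhlIHNhbXBsZSBub25jZQ==, this yields s3pPLMBiTxaQ9kYGzzhZRbK+xOo=, the example value given in the RFC.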

Picking between them

flowchart TD
    Q1{Need server → client push?}
    Q1 -->|No| REST[Plain REST or polling]
    Q1 -->|Yes| Q2{Need client → server too?}
    Q2 -->|No, just push| SSE[Server-Sent Events]
    Q2 -->|Yes, bi-directional| Q3{Latency-critical?}
    Q3 -->|Yes| WS[WebSocket]
    Q3 -->|No| WS2[WebSocket or long polling]
    style SSE fill:#15803d,color:#fff
    style WS fill:#0e7490,color:#fff

SSE is criminally underused. For "send me notifications as they happen" — the case for 80% of "real-time" features in web apps — it's simpler and safer than WebSocket. Reach for WebSocket only when you genuinely need the client to push data back at high frequency (chat, collaborative editing, live gaming).

Webhooks: someone else's server calls yours

A webhook is a callback URL you register with someone else's service. When an event happens there, they POST to your URL.

sequenceDiagram
    participant App as Your App
    participant S as Stripe
    participant U as User
    Note over App,S: setup time
    App->>S: register webhook URL
    Note over App,S: later
    U->>S: completes payment
    S->>App: POST /webhooks/stripe<br/>(signed payload)
    App-->>S: 200 OK (within 10s — Stripe's limit)

Five rules for receiving webhooks:

Verify the signature. Every reputable provider signs payloads (HMAC). Reject anything unsigned.
Respond within their timeout (usually 5–30s). Do the actual work asynchronously — enqueue a job, return 200, process later.
Be idempotent. Webhooks retry. Use the event ID to dedupe.
Handle out-of-order events. The provider does not guarantee order on retries.
Be replay-safe. Persist the raw event before processing, so you can rebuild state from the log.
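Several of these rules fit in a few lines of code. The Python sketch below assumes a provider that signs the raw request body with HMAC-SHA256 and sends the hex digest alongside it; real providers differ in header names and often include a timestamp in the signed content, so treat SECRET, sign, and handle_webhook as hypothetical stand-ins.

```python
import hashlib
import hmac
import json

SECRET = b"whsec_example"   # hypothetical shared secret from the provider
seen_event_ids = set()      # in production: a durable store with a unique index
raw_event_log = []          # in production: an append-only table or queue


def sign(body: bytes, secret: bytes = SECRET) -> str:
    """Hex HMAC-SHA256 of the raw body, as the assumed provider computes it."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def handle_webhook(body: bytes, signature: str) -> int:
    """Return the HTTP status to send back to the provider."""
    # Rule 1: verify the signature with a constant-time comparison.
    if not hmac.compare_digest(sign(body).encode(), signature.encode()):
        return 400
    event = json.loads(body)
    # Rule 3: dedupe on the event id, because providers retry.
    if event["id"] in seen_event_ids:
        return 200
    # Rule 5: persist the raw payload before doing any processing.
    raw_event_log.append(body)
    seen_event_ids.add(event["id"])
    # Rule 2: hand off to a background job here and return immediately.
    return 200
```

Note that a duplicate delivery still gets a 200: from the provider's point of view the event was received, and anything else would only trigger more retries.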

Sending webhooks (you become the provider) adds: signed payloads, exponential-backoff retries, a delivery dashboard, and a mechanism for customers to disable a flapping endpoint.
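For the retry side, a common shape is exponential backoff with a cap and random jitter. The sketch below is one possible schedule, not a prescribed one: backoff_delay, the one-second base, and the one-hour cap are all assumptions chosen for illustration.

```python
import random


def backoff_delay(attempt, base=1.0, cap=3600.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    The ceiling doubles with each attempt up to `cap`; the actual delay is a
    random fraction of that ceiling so retries from many deliveries spread out.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return ceiling * rng()
```

After some number of consecutive failures the sender would stop retrying and mark the endpoint as disabled, which is where the delivery dashboard comes in.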

How a request flows through these layers

It helps to see all of this together. Here's the path a typical API request takes from a mobile client through to the database; most of the pieces from this module appear somewhere in that chain.

flowchart LR
    MOB[Mobile client] -->|REST over HTTPS| GW[API Gateway]
    GW -->|rate limit check| RL[Rate Limiter]
    RL -->|allowed| AUTH[Auth middleware]
    AUTH -->|JWT verified| SVC[Service]
    SVC -->|gRPC| DS[Downstream service]
    SVC -->|WebSocket push| WS[Realtime clients]
    SVC --> DB[(Database)]
    SVC -->|webhook callback| EXT[External partner]
    style GW fill:#ff6b1a,color:#0a0a0f
    style RL fill:#ffaa00,color:#0a0a0f
    style SVC fill:#0e7490,color:#fff
    style DS fill:#a855f7,color:#fff
    style WS fill:#15803d,color:#fff

The mobile client speaks REST to the API gateway, which enforces rate limits and verifies auth before the request reaches your service. Internally, that service calls downstream services over gRPC. It pushes updates out to connected browser clients over WebSocket. And if an external partner needs to know about the event, the service fires a signed webhook.

Authentication patterns (a tour)

Every API needs to identify callers. The four patterns you'll see:

Pattern	How it works	When
API key	Long random string in header	Internal services, simple SaaS
Bearer token (JWT)	Signed token; server verifies signature	User-facing, stateless
OAuth 2.0	Token issued by a third-party identity provider	"Sign in with Google", delegated access
mTLS	Each side presents a TLS cert	Service mesh, B2B integrations

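JWT pitfalls: putting too much in the token (it travels with every request, and its payload is only encoded, not encrypted), and forgetting that a JWT cannot be revoked once issued. For the second, either keep tokens short-lived (15 min) with a refresh token, or maintain a revocation list (and you've lost statelessness).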

Error envelopes and partial failures

A consistent error shape is worth more than any specific spec. RFC 9457 (application/problem+json), which obsoletes RFC 7807, is a good default:

{
  "type": "https://api.example.com/errors/insufficient-funds",
  "title": "Insufficient funds",
  "status": 402,
  "detail": "Account 1234 has balance $42.10; requested charge $50.00.",
  "instance": "/accounts/1234/charges",
  "trace_id": "01HX..."
}

The trace_id is the killer field — it's the breadcrumb your support team uses to find the request in your logs.

For batch endpoints that partially fail, return per-item statuses:

{
  "results": [
    {"id": "a", "status": 200},
    {"id": "b", "status": 409, "error": "duplicate"}
  ]
}

Don't return 500 for "9 of 10 worked" — that throws away the 9 successes.

Rate limiting (a preview)

Every public API limits requests per caller. The two algorithms you'll see:

Token bucket: bucket holds N tokens, refills at R per second. Each request consumes one. Allows bursts up to N.
Sliding window: count requests in the last 60 seconds; reject if over limit.
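The token bucket is small enough to sketch in full. The Python version below is a single-process illustration under stated assumptions: TokenBucket and allow are hypothetical names, the clock is injectable for testing, and a real limiter would keep one bucket per caller in a shared store.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `refill_rate` tokens/second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self._clock = clock
        self._tokens = float(capacity)   # start full
        self._last = clock()

    def allow(self):
        """Consume one token if available; return False when rate limited."""
        now = self._clock()
        # Refill lazily based on elapsed time, never above capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

When allow() returns False, the handler responds with 429 and can derive Retry-After from how long it takes to refill one token (1 / refill_rate seconds at most).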

When rate limited, return 429 Too Many Requests with:

HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1736284800

Full coverage is in Reliability Patterns and the rate limiter article.

A quick smell test for any API design

Before you ship, walk through this checklist:

Tick these and you have an API that won't be the source of next quarter's incidents.

Things you should now be able to answer

A POST request times out — your client retries. How does the server know not to charge twice?
Why is offset/limit pagination dangerous on a busy table?
Your mobile app needs three pieces of data on one screen and you currently have three REST endpoints. What are the trade-offs of consolidating?
A coworker proposes WebSockets for "real-time notifications". Why might SSE be a better fit?
Why can't you safely "revoke" a JWT?
gRPC streams beat REST polling for live data — but what's the cost in operational complexity?

→ Next: Databases

// KEY TAKEAWAYS

▸ Statelessness is the load-bearing REST constraint: if the server needs session memory to interpret the next request, you cannot horizontally scale without sticky sessions.
▸ Use cursor-based pagination on any list endpoint where the underlying data mutates; offset/limit causes both performance degradation and row-skip bugs at scale.
▸ Protobuf binary is typically 3 to 10 times smaller than equivalent JSON, making gRPC the right default for internal microservices where you control both ends.
▸ SSE covers roughly 80 percent of real-time web features with far less operational complexity than WebSocket; use WebSocket only when the client must also push data at high frequency.
▸ JWTs cannot be revoked once issued, so keep them short-lived (15 minutes) with a refresh token, or accept that you need a revocation list and lose statelessness.

// FAQ

Frequently asked questions

▸ What is an idempotency key and why does it matter for write APIs?

An idempotency key is a UUID the client generates and sends in an Idempotency-Key header on write requests. The server stores the key paired with the original response for roughly 24 hours; if the same key arrives again — because the client retried after a timeout — the server returns the original response without re-executing the operation. This is the single most important pattern for making a write API safe to retry without double-charging a customer.

▸ Why is offset/limit pagination dangerous at scale, and what should you use instead?

OFFSET 1,000,000 forces the database to scan and discard a million rows on every page request, and concurrent inserts shift page boundaries so rows get skipped or duplicated. Cursor-based pagination fixes both problems: the server returns an opaque next_cursor token encoding the last ID seen, the client passes it back, and the database uses a keyset seek with no offset scan and no drift under inserts.

▸ When should you choose gRPC over REST for inter-service communication?

Choose gRPC when you control both ends of the connection and care about payload size and strict contracts. Protobuf binary is typically 3 to 10 times smaller than equivalent JSON, HTTP/2 multiplexing lets many in-flight requests share one connection, and breaking schema changes surface at compile time rather than at runtime. The trade-off is that you cannot debug gRPC frames with curl and browsers cannot speak gRPC natively without a gRPC-Web proxy.

▸ What is the key difference between SSE and WebSockets, and when should you prefer SSE?

SSE is a one-way server-to-client stream over a long-lived HTTP response, while WebSocket is a full-duplex TCP connection where both sides send frames freely. For the common case of pushing notifications to a browser, SSE is simpler: it uses a native browser API, reconnects automatically with Last-Event-ID, and requires no extra libraries. Reach for WebSocket only when the client must also send data back at high frequency, such as in chat, collaborative editing, or live gaming.

▸ What are the three versioning strategies for a REST API?

URL versioning places the version in the path such as /v1/orders and is the most explicit and easiest to route at the edge. Header versioning encodes the version in the Accept header, producing cleaner URLs but harder debugging. The third approach is no versioning at all, evolving forward only by never breaking existing behavior and only adding — Stripe's method, which has the lowest cognitive overhead for callers but the highest discipline cost internally.
