Forward vs Reverse Proxy (and API Gateway)

What "proxy" actually means in different contexts. Forward proxies, reverse proxies, load balancers, API gateways — what each does and where the lines blur.

15 min read · 2026-03-03 · Ironclad Academy

#networking #proxy #architecture #load-balancing

"Proxy" is one of the most overloaded words in infrastructure. A forward proxy and a reverse proxy share a name and almost nothing else. Then add load balancer and API gateway and the picture gets murkier. This article draws the lines clearly: what each is, who it protects, who knows about it, and what it does at the byte level.

The one-line distinction

flowchart LR
    subgraph "Forward proxy"
    direction LR
    C1[Client] --> FP[Forward Proxy] --> S1[Server]
    end
    subgraph "Reverse proxy"
    direction LR
    C2[Client] --> RP[Reverse Proxy] --> S2[Server]
    end
    style FP fill:#ff6b1a,color:#0a0a0f
    style RP fill:#15803d,color:#fff

Both sit in the middle. The difference is who they represent:

A forward proxy represents the client. The server doesn't know about it.
A reverse proxy represents the server. The client doesn't know about it.

That single difference cascades into every property — who configures it, what it logs, what it can rewrite, what threats it mitigates.

Forward proxy

The client opts in: "instead of talking to the internet directly, send everything through this proxy." From the server's view, all requests appear to come from the proxy.

flowchart LR
    L[Laptop] -->|"all egress traffic"| FP[Forward Proxy<br/>company.proxy:8080]
    FP --> I[Internet]
    I --> G[Google]
    I --> GH[GitHub]
    I --> SP[Some site]
    style FP fill:#ff6b1a,color:#0a0a0f

Corporate IT is the canonical user. Every employee's traffic runs through a proxy that logs requests, blocks malware sites, enforces content policy, and caches shared downloads like OS updates and npm packages. Schools and coffee-shop networks do the same thing for different reasons. Tor extends the idea into a chain of forward proxies to anonymize the client. VPNs achieve a similar outcome at a lower layer — they build an encrypted L3/L4 tunnel that routes all OS traffic, not just HTTP, through a remote exit node. In the end the destination server sees the exit node's IP instead of yours, just like a forward proxy, but the mechanism is a network tunnel rather than an application-layer interception.

The fundamental thing a forward proxy gives you is visibility into, and control over, outbound traffic. It can filter which domains employees reach, detect data leaks (DLP), cache downloads that the whole office needs, and hide the real client IP from the destination. What it cannot naturally do is look inside TLS connections — the encryption is end-to-end between the laptop and the destination, and the proxy just sees an opaque stream. To inspect that traffic, corporate proxies perform TLS interception: they terminate the TLS session from the laptop, decrypt it, re-encrypt toward the real destination, and hold the cleartext in the middle. The only reason the laptop trusts this is that IT pre-installed a corporate root CA on every device, making the proxy's dynamically generated certificates appear valid. The "Cisco Umbrella" or "Zscaler" cert chain you sometimes see in a browser on a corporate machine is this at work.
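To make the opt-in and the "opaque stream" point concrete, here is a minimal sketch of what a proxy-aware client sends first. The function name `proxy_request_head` is hypothetical; the two request forms (absolute URL for plain HTTP, `CONNECT` for HTTPS) are the standard HTTP/1.1 ones, and real clients add more headers than shown.

```python
from urllib.parse import urlsplit


def proxy_request_head(url: str) -> str:
    """Return the first lines a client would send to a forward proxy for `url`."""
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme == "https":
        # HTTPS: ask the proxy to open a raw TCP tunnel. TLS then runs
        # end-to-end inside the tunnel, so the proxy only relays opaque bytes.
        port = parts.port or 443
        return f"CONNECT {host}:{port} HTTP/1.1\r\nHost: {host}:{port}\r\n\r\n"
    # Plain HTTP: send the absolute URL so the proxy knows where to forward.
    port = parts.port or 80
    host_header = host if port == 80 else f"{host}:{port}"
    return f"GET {url} HTTP/1.1\r\nHost: {host_header}\r\n\r\n"


print(proxy_request_head("https://github.com/"))
print(proxy_request_head("http://example.com/index.html"))
```

With `CONNECT`, the proxy learns only the destination host and port. That is why inspecting the payload requires the interception step described above.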

Reverse proxy

The server operator deploys it. From the client's view, the reverse proxy is the server. The real backends sit behind it, often unaddressable from the public internet.

flowchart LR
    C[Clients] --> RP[Reverse Proxy<br/>api.acme.com]
    RP --> S1[Backend 1]
    RP --> S2[Backend 2]
    RP --> S3[Backend N]
    style RP fill:#15803d,color:#fff

Nearly every public web service runs at least one. Nginx, HAProxy, Envoy, Traefik, AWS ALB/NLB, Cloudflare, and Akamai are all reverse proxies with varied feature sets.

The reverse proxy terminates TLS so backends don't have to manage certificates, then distributes requests across a pool of healthy servers, routing based on URL path or Host header when different paths map to different services. It caches static assets and safe API responses, rate-limits per IP or API key before the request ever reaches your application, validates JWTs at the edge so every backend doesn't need to re-implement auth, and compresses responses on the way out. On the operational side it runs health checks continuously and stops sending traffic to any backend that starts failing. It also makes deployments safer — sending 1% of traffic to a new backend version while keeping 99% on the old one lets you validate a release with real traffic before fully cutting over.

What makes the reverse proxy powerful is that none of these capabilities require the backends to know about them. The backends just handle business logic; the proxy handles everything from the internet's edge inward.
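The routing and health-check behaviour can be sketched in a few lines. This is an illustrative toy, not how any particular proxy implements it: the class `ReverseProxyRouter`, its method names, and the backend addresses are all made up, and health state is set by hand rather than by a real checker.

```python
import itertools
from typing import Optional


class ReverseProxyRouter:
    """Toy router: longest-prefix path match, then round-robin over healthy backends."""

    def __init__(self, routes: dict[str, list[str]]):
        self.routes = routes                       # path prefix -> backend addresses
        self.unhealthy: set[str] = set()           # a health checker would maintain this
        self._counters = {prefix: itertools.count() for prefix in routes}

    def mark(self, backend: str, healthy: bool) -> None:
        if healthy:
            self.unhealthy.discard(backend)
        else:
            self.unhealthy.add(backend)

    def pick(self, path: str) -> Optional[str]:
        matches = [prefix for prefix in self.routes if path.startswith(prefix)]
        if not matches:
            return None
        prefix = max(matches, key=len)             # most specific route wins
        pool = [b for b in self.routes[prefix] if b not in self.unhealthy]
        if not pool:
            return None                            # a real proxy would answer 502/503
        return pool[next(self._counters[prefix]) % len(pool)]


router = ReverseProxyRouter({
    "/api": ["10.0.0.1:8080", "10.0.0.2:8080"],    # hypothetical backends
    "/": ["10.0.0.9:8080"],
})
print(router.pick("/api/users"), router.pick("/api/users"), router.pick("/index.html"))
```

The backends never see any of this logic, which is the point of the paragraph above: they receive requests, and the choice of who receives them lives entirely in the proxy.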

Side-by-side

Property	Forward proxy	Reverse proxy
Who deploys it	Client side (user / IT)	Server side (operator)
Who knows it exists	The client	The operator; the client thinks it's the server
Common reasons	Filtering, anonymity, cache	TLS, load balance, route, cache
Sees client IP	Yes	Yes (forwards via X-Forwarded-For)
Sees real server IP	Yes	Yes, and hides it from the client
Sees TLS cleartext	Only if MITM	Yes (terminates TLS)
Scales the server	No	Yes
Hides the server	No	Yes

Load balancer vs reverse proxy

A load balancer is a reverse proxy, specialized for distributing load. The terms overlap heavily.

flowchart TD
    P[Proxies that face clients] --> RP[Reverse proxy<br/>routes + caches + terminates]
    P --> LB[Load balancer<br/>distributes load]
    P --> AG[API Gateway<br/>routes + auth + transforms]
    P --> WAF[WAF<br/>filters malicious requests]
    P --> CDN[CDN<br/>cache at edge]
    style RP fill:#15803d,color:#fff
    style LB fill:#0e7490,color:#fff
    style AG fill:#ff6b1a,color:#0a0a0f

The practical distinction comes down to how much HTTP the proxy understands.

A pure L4 load balancer (HAProxy in tcp mode, AWS NLB) operates at the TCP level. It distributes connections without parsing HTTP, which makes it extremely fast. But it cannot route by URL path or inject headers, because it never looks past the transport layer. One subtle point: HAProxy in tcp mode still terminates the TCP connection and opens a new one to the backend — it is not a transparent pass-through. This means the backend sees HAProxy's IP unless you enable the PROXY protocol to carry the real client IP forward. AWS NLB handles this differently and can preserve the client source IP natively.
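For reference, the text (v1) form of the PROXY protocol is a single line that the load balancer writes at the start of the backend connection, before any application bytes. The sketch below only builds that line; the helper name `proxy_v1_header` and the addresses are illustrative, and the backend must be configured to expect the header or it will treat it as garbage.

```python
import ipaddress


def proxy_v1_header(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bytes:
    """Build the human-readable PROXY protocol v1 line for one TCP connection."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if src.version != dst.version:
        raise ValueError("source and destination must use the same IP version")
    family = "TCP4" if src.version == 4 else "TCP6"
    return f"PROXY {family} {src} {dst} {src_port} {dst_port}\r\n".encode("ascii")


# Real client 203.0.113.7:51234 connected to the balancer's address 10.0.0.5:443.
print(proxy_v1_header("203.0.113.7", 51234, "10.0.0.5", 443))
```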

An L7 load balancer (Nginx, Envoy, AWS ALB) speaks HTTP. It can route /api to one backend and /static to another, add X-Forwarded-For headers, enforce retries on 5xx responses, and terminate TLS. The cost is a bit more latency and CPU per request; the payoff is much more control.
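Header injection is one of the simplest L7 capabilities to show. A rough sketch, assuming headers arrive as a plain dict with the exact key `X-Forwarded-For` (real HTTP header names are case-insensitive, which this ignores); the function name is hypothetical.

```python
def with_forwarded_for(headers: dict[str, str], peer_ip: str) -> dict[str, str]:
    """Return a copy of `headers` with the connecting peer appended to X-Forwarded-For."""
    out = dict(headers)
    existing = out.get("X-Forwarded-For")
    # Each proxy appends the address it received the connection from.
    out["X-Forwarded-For"] = f"{existing}, {peer_ip}" if existing else peer_ip
    return out


print(with_forwarded_for({}, "198.51.100.4"))
print(with_forwarded_for({"X-Forwarded-For": "203.0.113.7"}, "198.51.100.4"))
```

Because each hop appends, entries the client supplied itself sit at the left of the list and cannot be trusted. Backends should rely only on the entries added by proxies they control.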

An API Gateway takes the L7 proxy and adds opinions about API-shaped traffic: auth, per-key rate limiting, request and response transformation, OpenAPI integration. Every API gateway is a reverse proxy. Not every reverse proxy is an API gateway.

The load balancer module covers L4 vs L7 in depth.

L4 vs L7: what the proxy actually sees

To make the L4/L7 split concrete, here is what each proxy inspects and what it can act on:

flowchart LR
    PKT[Incoming packet] --> L4{L4 proxy}
    L4 -->|"reads: src IP, dst IP, port"| L4OUT["Routes by TCP tuple<br/>Cannot see path or headers"]
    L4OUT --> BACK1[Backend pool]
    PKT2[Incoming request] --> L7{L7 proxy}
    L7 -->|"reads: HTTP verb, path, headers, body"| L7OUT["Routes by /api vs /static<br/>Injects headers, retries 5xx"]
    L7OUT --> BACK2[Backend pool]
    style L4 fill:#0e7490,color:#fff
    style L7 fill:#ff6b1a,color:#0a0a0f
    style L4OUT fill:#ffaa00,color:#0a0a0f
    style L7OUT fill:#15803d,color:#fff

L4 is simpler and faster; L7 is more expensive per request but gives you the request-level routing and control that a microservices architecture typically relies on.
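As a sketch of the L4 side, the function below picks a backend using nothing but the connection tuple. Hashing the tuple is one common approach, shown here as an assumption rather than what any specific balancer does; `l4_pick` and the backend names are hypothetical.

```python
import hashlib


def l4_pick(backends: list[str], src_ip: str, src_port: int,
            dst_ip: str, dst_port: int) -> str:
    """Choose a backend from the connection tuple alone; no payload is read."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Same tuple -> same backend, so every packet of a connection lands together.
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


pool = ["backend-a", "backend-b", "backend-c"]
print(l4_pick(pool, "203.0.113.7", 51234, "10.0.0.5", 443))
```

Nothing in the inputs mentions a path, a header, or a verb, so this function could not route `/api` and `/static` differently even if it wanted to.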

API Gateway in depth

An API Gateway sits between clients and your microservices. Its job is to make N services look like one cohesive API.

flowchart LR
    C[Mobile / Web clients] --> AG[API Gateway]
    AG -->|"/users"| U[User Service]
    AG -->|"/orders"| O[Order Service]
    AG -->|"/payments"| P[Payment Service]
    AG -->|"/shipping"| S[Shipping Service]
    style AG fill:#ff6b1a,color:#0a0a0f

A request lands at the gateway and the gateway does all the cross-cutting work before anything reaches a microservice: parse the JWT or API key, check rate limits, route to the right service by path, transform the request if the backend expects a different shape, and inject a trace ID so you can follow the request through your observability stack. On the way back out it strips internal fields the client should not see and, if multiple backend calls were needed, assembles them into one response.

Versioning is a natural fit too — route /v1/* to an old backend and /v2/* to a new one, letting you migrate incrementally.
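The request flow described above can be sketched as a single function. Everything named here is hypothetical (the key store, route table, header names, and internal field names), and rate limiting, request transformation, and response aggregation are left out to keep it short.

```python
import uuid

API_KEYS = {"key-mobile-123": "mobile-app"}                        # hypothetical key store
ROUTES = {"/users": "user-service", "/orders": "order-service"}    # hypothetical routes
INTERNAL_FIELDS = {"internal_id", "shard"}                         # hypothetical field names


def handle(path: str, headers: dict, call_backend) -> tuple[int, dict]:
    """Run the cross-cutting steps, then delegate to call_backend(service, path, headers)."""
    caller = API_KEYS.get(headers.get("X-Api-Key", ""))
    if caller is None:
        return 401, {"error": "unknown API key"}

    service = next((s for prefix, s in ROUTES.items() if path.startswith(prefix)), None)
    if service is None:
        return 404, {"error": "no route"}

    upstream_headers = dict(headers)
    upstream_headers["X-Caller"] = caller
    # Reuse an incoming trace ID if present, otherwise start a new trace.
    upstream_headers["X-Trace-Id"] = headers.get("X-Trace-Id") or uuid.uuid4().hex

    body = call_backend(service, path, upstream_headers)
    # Strip fields the client should not see before responding.
    return 200, {k: v for k, v in body.items() if k not in INTERNAL_FIELDS}


def fake_backend(service, path, headers):
    return {"name": "Ada", "internal_id": 7}


print(handle("/users/42", {"X-Api-Key": "key-mobile-123"}, fake_backend))
```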

API Gateway failure modes

Because the gateway sits in the critical path for every request, it deserves the same reliability thinking you'd apply to a database. A few patterns worth internalizing:

Centralizing JWT validation through a remote auth service adds one network hop to every request. The fix is validating JWTs locally — the gateway caches the public key and verifies signatures in-process, with no remote call needed for already-issued tokens.
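A minimal sketch of in-process validation follows. One deliberate substitution: it uses HS256 (a shared secret) rather than the public-key signatures the paragraph describes, because Python's standard library has no RSA or ECDSA verification. The shape is the same either way: key material is cached in the gateway and the check involves no network call. The function names are illustrative, and a production gateway would use a maintained JWT library and also check claims such as issuer and audience.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Issue a token (included only so the example is self-contained)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if signature and expiry check out, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        # Pin the algorithm; never let the token choose how it is verified.
        if not isinstance(header, dict) or header.get("alg") != "HS256":
            return None
        expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                            hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError):
        return None
    if not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if exp is not None and current >= exp:
        return None
    return claims


token = sign_hs256({"sub": "user-1", "exp": 2000}, b"cached-secret")
print(verify_hs256(token, b"cached-secret", now=1000))
```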

Every auth plugin, rate-limiter, and transformer in a middleware chain adds latency. A 20-plugin chain at 0.5 ms each is 10 ms bolted onto every request. Profile the chain and prune what isn't earning its cost.

The gateway itself needs to run behind a load balancer. Design it for graceful degradation under overload — shed non-critical traffic and return 429 rather than letting the gateway time out and cascade failures downstream.
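One simple way to shed load is a concurrency cap that holds back a few slots for critical traffic. The sketch below is an assumption about how that might look, not a description of any gateway product; `LoadShedder` and its parameters are hypothetical, and a rejected request is the one that would receive the 429.

```python
class LoadShedder:
    """Admit requests up to a concurrency cap, rejecting non-critical ones first."""

    def __init__(self, max_in_flight: int, reserved_for_critical: int):
        self.max_in_flight = max_in_flight
        # Non-critical traffic may only fill the unreserved part of the cap.
        self.soft_limit = max_in_flight - reserved_for_critical
        self.in_flight = 0

    def try_admit(self, critical: bool) -> bool:
        """Return True if admitted; the caller must call release() when done."""
        limit = self.max_in_flight if critical else self.soft_limit
        if self.in_flight >= limit:
            return False          # respond 429 instead of queueing
        self.in_flight += 1
        return True

    def release(self) -> None:
        self.in_flight -= 1


shedder = LoadShedder(max_in_flight=3, reserved_for_critical=1)
print([shedder.try_admit(critical=False) for _ in range(3)])
print(shedder.try_admit(critical=True))
```

Rejecting immediately keeps latency bounded for the requests that are admitted. Queueing them instead is what lets timeouts pile up and cascade.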

Rolling deploys matter at the gateway specifically: if every gateway instance is replaced at once, all of its upstream connections reset simultaneously. Without connection draining, backends see a spike of new connection attempts all at once.

Backend-for-frontend (BFF)

A variant: instead of one gateway for all clients, you run a thin gateway per client type. The mobile app talks to a mobile BFF; the web app talks to a web BFF; both BFFs talk to the same backends but shape responses for their client. Each client gets exactly the payload shape it needs, and each client team has autonomy over its own BFF. The tradeoff is more services to maintain, and if different teams share a single BFF without coordination it can quietly become a new shared bottleneck.

Popular API Gateways

Tool	Strengths
Kong	Open source, plugin ecosystem
AWS API Gateway	Managed, deep AWS integration
Tyk	Open source, multi-tenant
Apigee (Google)	Enterprise, analytics
Envoy (with Istio / Contour)	Modern, programmable
Cloudflare Workers + API Shield	Edge-first
Nginx + Lua / OpenResty	DIY, very fast

CDN — a special reverse proxy

A Content Delivery Network is a reverse proxy with hundreds of geographically distributed points of presence (Cloudflare operates in 337 cities across 8 regions as of mid-2026; Akamai and AWS CloudFront are in the same ballpark).

flowchart LR
    U1[User Tokyo] --> CDN_TK[CDN Tokyo edge]
    U2[User London] --> CDN_LN[CDN London edge]
    U3[User SF] --> CDN_SF[CDN SF edge]
    CDN_TK -->|"miss"| ORIGIN[Origin]
    CDN_LN -.miss.-> ORIGIN
    CDN_SF -.miss.-> ORIGIN
    style CDN_TK fill:#15803d,color:#fff
    style CDN_LN fill:#15803d,color:#fff
    style CDN_SF fill:#15803d,color:#fff

The CDN's core trick is simple: serve from the nearest edge when you can, forward to origin when you can't. For a user in Tokyo, TLS termination at the Tokyo edge collapses what would be a 150 ms intercontinental round-trip into a handful of milliseconds. Your origin sees only cache misses — often less than 1% of total traffic for a well-cached site.
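The serve-from-edge-or-forward logic reduces to a cache with an expiry in front of an origin fetch. A toy sketch, assuming a fixed TTL and an injected clock so the behaviour is easy to follow; real CDNs honour `Cache-Control`, vary on headers, and evict under memory pressure, none of which is modelled. `EdgeCache` is a made-up name.

```python
class EdgeCache:
    """TTL cache in front of an origin fetch function."""

    def __init__(self, fetch_origin, ttl_seconds: float):
        self.fetch_origin = fetch_origin
        self.ttl = ttl_seconds
        self.store: dict[str, tuple[float, bytes]] = {}   # url -> (expires_at, body)
        self.hits = 0
        self.misses = 0

    def get(self, url: str, now: float) -> bytes:
        entry = self.store.get(url)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]                # served from the edge, origin untouched
        self.misses += 1
        body = self.fetch_origin(url)      # only misses reach the origin
        self.store[url] = (now + self.ttl, body)
        return body


origin_calls = []

def origin(url: str) -> bytes:
    origin_calls.append(url)
    return b"<html>...</html>"

edge = EdgeCache(origin, ttl_seconds=60)
for t in (0, 10, 30, 61):
    edge.get("/index.html", now=t)
print(edge.hits, edge.misses, len(origin_calls))   # 2 2 2
```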

On top of the reverse proxy baseline, CDNs add geographic distribution across hundreds of PoPs, petabyte-scale caching of static and dynamic content, DDoS protection through anycast (a flood gets diluted across many locations before it can concentrate), and increasingly, edge compute so you can run logic like A/B tests or auth checks before the request leaves the continent (Cloudflare Workers, Lambda@Edge).

A real architecture stack

Here's what a production web stack often looks like:

flowchart LR
    C[Clients] --> CDN[CDN<br/>Cloudflare / Fastly]
    CDN --> WAF[WAF<br/>filter malicious]
    WAF --> LB[L4 Load Balancer<br/>NLB / GLB]
    LB --> RP[L7 Reverse Proxy<br/>Nginx / ALB / Envoy]
    RP --> AG[API Gateway<br/>auth, rate limit]
    AG --> APP[App Servers]
    style CDN fill:#ff6b1a,color:#0a0a0f
    style RP fill:#15803d,color:#fff
    style AG fill:#0e7490,color:#fff

Five different "proxies" all serving the same request. Each does one job well. You don't need all of them — small services collapse this into one or two layers.

Forward proxy security: the MITM dance

When a corporate forward proxy needs to inspect TLS traffic, it performs TLS interception. The mechanics look like this:

sequenceDiagram
    participant L as Laptop
    participant P as Corporate Proxy
    participant S as bank.com
    L->>P: TLS ClientHello (wants bank.com)
    P->>S: TLS ClientHello (opens its own connection)
    S-->>P: Server cert (bank.com)
    P-->>L: Fake cert for bank.com (signed by corporate CA)
    Note over L,P: Laptop trusts fake cert because<br/>corporate root CA is pre-installed
    L->>P: Encrypted request (bank.com)
    P->>P: Decrypt, inspect, log
    P->>S: Re-encrypt + forward
    S-->>P: Encrypted response
    P->>P: Decrypt, inspect
    P-->>L: Re-encrypt + deliver

The laptop is configured to trust a corporate root CA. The proxy issues fake certs for every site on the fly. The laptop sees "valid" TLS to the proxy; the proxy sees "valid" TLS to the real site. Cleartext lives at the proxy.

This is legitimate corporate practice. It's also indistinguishable from a malicious MITM if you don't know to look. The "Cisco Umbrella" or "Zscaler" cert chain that sometimes appears in a browser on a corporate device is exactly this mechanism — made visible by clicking the lock icon.

Things that confuse beginners

"Is Nginx a forward or reverse proxy?"

Primarily a reverse proxy. Nginx's native mode is reverse proxying. For open-source Nginx, forward proxy support (specifically the HTTP CONNECT method used for HTTPS tunneling) requires the third-party ngx_http_proxy_connect_module compiled in — it is not available in standard Nginx packages. NGINX Plus R36+ ships a built-in ngx_http_tunnel_module that provides this natively, but that is the commercial product. In practice, almost all Nginx deployments are reverse proxies.

"Is a VPN a proxy?"

Similar effect, different layer. A VPN creates an encrypted L3/L4 tunnel: the OS routes all traffic (any protocol, any port) through it, not just HTTP. A forward proxy is typically L7 and is configured per application. In both cases the destination sees the remote exit node as if it were the direct client, so the outcome (IP masking, egress routing) looks the same from the outside. The architectural difference is: forward proxy = application-layer interception; VPN = network-layer tunnel covering every protocol.

"What's the difference between API Gateway and reverse proxy?"

An API gateway is a reverse proxy that knows about API-shaped traffic. The difference is feature richness, not architecture.

"What's the difference between Load Balancer and Reverse Proxy?"

A load balancer is one use case of a reverse proxy (sometimes at L4, where it's not strictly a "proxy" in the HTTP sense). Everyday usage blurs the terms.

Performance considerations

TLS termination cost has two components that are easy to conflate. The first is crypto CPU overhead — roughly 1–2 ms of server-side computation per new handshake at typical RSA/ECDHE key sizes on modern hardware with hardware acceleration (AVX2/AVX-512 for elliptic-curve math, dedicated crypto ASICs, etc.), which is manageable at scale; without hardware offload or under heavy RSA load, this can be significantly higher. The second is round-trip latency: TLS 1.2 adds 2 RTTs after the TCP connection is established; TLS 1.3 adds 1 RTT (0-RTT resumption possible for returning clients). Terminating TLS at a geographically close edge node is the biggest win because it collapses intercontinental RTTs into local ones. Session tickets and HTTP/2 multiplexing reduce both costs substantially. Don't confuse "cheap at the terminator" with "free for the client" — the latency difference is felt in the network, not the CPU.
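The round-trip arithmetic is worth working through once. The sketch below counts only network round trips for a fresh connection (one for the TCP handshake, the TLS handshake round trips quoted above, and one for the request and response) and ignores CPU time, resumption, and 0-RTT. The 150 ms and 5 ms figures are illustrative RTTs for a distant origin and a nearby edge.

```python
def time_to_first_response_ms(rtt_ms: int, tls_version: str) -> int:
    """Network time on a fresh connection: TCP handshake + TLS handshake + one request."""
    tls_rtts = {"1.2": 2, "1.3": 1}[tls_version]
    return rtt_ms * (1 + tls_rtts + 1)


for label, rtt in (("distant origin", 150), ("nearby edge", 5)):
    for version in ("1.2", "1.3"):
        total = time_to_first_response_ms(rtt, version)
        print(f"{label:>14}  TLS {version}: {total} ms")
# distant origin: 600 ms (TLS 1.2), 450 ms (TLS 1.3)
# nearby edge:     20 ms (TLS 1.2),  15 ms (TLS 1.3)
```

Moving termination closer saves hundreds of milliseconds here, while the 1 to 2 ms of handshake CPU barely registers. That is the sense in which the latency is felt in the network rather than the CPU.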

Each proxy hop adds a TCP connection and any processing time, typically 0.5–5 ms on the same data-center fabric. Stack five layers only when each one earns its keep.

Between proxy and backend, reuse persistent TCP (and HTTP/2) connections. Without pooling, each request triggers a new TCP and TLS handshake to the backend and you rapidly exhaust ephemeral port ranges at high QPS. The Linux default ephemeral range is 32768–60999, giving roughly 28k source ports per (proxy IP, backend IP, backend port) combination; the theoretical max if you widen the range is ~64k.
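A back-of-the-envelope calculation shows how quickly the port range becomes the limit. It assumes the proxy closes each connection itself, that a closed connection holds its source port in TIME_WAIT for 60 seconds (the usual Linux value), and that no port-reuse tuning is in effect. Treat the result as a rough ceiling per backend address, not a measured figure.

```python
def ephemeral_ports(low: int = 32768, high: int = 60999) -> int:
    """Number of source ports in an inclusive ephemeral range."""
    return high - low + 1


def max_new_connections_per_second(ports: int, time_wait_seconds: int = 60) -> int:
    """Sustained rate of short-lived connections before ports run out."""
    return ports // time_wait_seconds


default_ports = ephemeral_ports()
widened_ports = ephemeral_ports(1024, 65535)
print(default_ports, max_new_connections_per_second(default_ports))   # 28232 470
print(widened_ports, max_new_connections_per_second(widened_ports))   # 64512 1075
```

A few hundred new connections per second to one backend is easy to exceed without pooling. With pooling, a handful of long-lived connections carry the same traffic and the limit stops mattering.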

Proxies that buffer the full request body reduce backend load spikes — slow clients don't hold backend threads open — but break streaming and add head-of-line latency. Configure per route: buffer uploads, stream video.

When to add a proxy

Add a reverse proxy when you have multiple backends behind one address, TLS termination needs centralization, you want cross-service routing or caching or rate limiting, or you need backends to be unreachable directly from the internet.

Skip it when you have one service and one server — a proxy is just another hop and another thing to configure. If the cloud load balancer you already have does what you need, a dedicated reverse proxy layer is overhead you don't need to carry.

Add an API Gateway when you have many microservices and many clients, when per-API-key rate limiting or billing needs to happen somewhere, or when auth needs to consolidate in one place. Skip it for a monolith (your app layer is already the central place) or when you're running three services with a small team — the operational overhead tends to exceed the benefit until you hit a threshold of client diversity or service count.

Things you should now be able to answer

What's the one-line difference between a forward and reverse proxy?
Why does a corporate forward proxy MITM TLS?
What does an API Gateway add over a reverse proxy?
Why does a CDN count as a reverse proxy?
What's TLS termination, and why is it usually at the edge?

Frequently asked questions

▸What is the core difference between a forward proxy and a reverse proxy?

A forward proxy represents the client: the client explicitly routes egress through it, and the destination server sees only the proxy's IP. A reverse proxy represents the server: the server operator deploys it, the client thinks it is the server, and the real backends remain hidden behind it.

▸When should you add an API Gateway instead of a plain reverse proxy?

Add an API Gateway when you have many microservices and many clients, need per-API-key rate limiting or billing, or want auth consolidated in one place. Skip it for a monolith or a small team running three services — the operational overhead exceeds the benefit until you hit a meaningful threshold of client diversity or service count.

▸How does a corporate forward proxy inspect HTTPS traffic?

It performs TLS interception: the proxy terminates the TLS session from the laptop, decrypts the request, re-encrypts toward the real destination, and holds cleartext in the middle. This works because IT pre-installs a corporate root CA on every device, causing the laptop to trust the proxy's dynamically generated per-site certificates — the Cisco Umbrella or Zscaler cert chain visible in a browser on a corporate machine is exactly this mechanism.

▸How many cities does Cloudflare's CDN operate in, and why does geographic distribution matter for TLS?

Cloudflare operates in 337 cities across 8 regions as of mid-2026. Terminating TLS at a geographically close edge collapses what would be a 150 ms intercontinental round-trip into a handful of milliseconds, and the origin server then sees only cache misses — often less than 1% of total traffic for a well-cached site.

▸What is the practical difference between an L4 and an L7 load balancer?

An L4 load balancer operates at the TCP level, distributing connections without parsing HTTP, which makes it very fast but unable to route by URL path or inject headers. An L7 load balancer speaks HTTP and can route /api to one backend and /static to another, add X-Forwarded-For headers, enforce retries on 5xx responses, and terminate TLS — at the cost of more CPU and latency per request.
