Web Servers and Reverse Proxies¶
Web servers and reverse proxies are critical infrastructure components that sit between clients and application servers, handling HTTP traffic, SSL/TLS termination, load balancing, static file serving, and more. Understanding how they work is essential for deploying and operating web applications in production.
How Web Traffic Flows in Production¶
A typical production request flow involves multiple layers, each adding functionality (and latency):
Client (Browser/App)
│
▼
DNS Resolution (Route 53, Cloudflare)
│ Recursive resolver → Root → TLD → Authoritative
│ Cached at each layer (TTL-based)
▼
CDN (CloudFront, Cloudflare) ──── Cache HIT? → Return cached response
│ Cache MISS
▼
Load Balancer (ALB, Nginx, HAProxy)
│ Distributes across healthy backends
│ L4 (TCP) or L7 (HTTP) routing
▼
Reverse Proxy / Web Server (Nginx, Caddy)
│ - SSL/TLS termination
│ - Static file serving
│ - Request routing
│ - Rate limiting
│ - Compression (gzip/brotli)
│ - Request/response header manipulation
▼
Application Server (Gunicorn, uvicorn, Node.js, Actix)
│ Business logic execution
▼
Backend Services (Database, Cache, Message Queue, External APIs)
DNS Resolution in Detail¶
When a browser needs to resolve api.example.com:
- Browser cache: Checks its own DNS cache (Chrome: chrome://net-internals/#dns)
- OS cache: Checks the operating system's DNS resolver cache
- Recursive resolver: Sends the query to the configured DNS server (ISP, 8.8.8.8, 1.1.1.1)
- Root nameserver: Directs to the .com TLD nameserver
- TLD nameserver: Directs to example.com's authoritative nameserver
- Authoritative nameserver: Returns the IP address (or a CNAME, triggering another lookup)
Each answer includes a TTL (Time to Live) that determines how long the result is cached. Short TTLs (60s) enable fast failover but increase DNS query load; long TTLs (3600s) reduce load but delay propagation of changes.
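The layered, TTL-driven caching can be sketched as a toy resolver cache (hypothetical TTLCache class, illustrative only):

```python
import time

class TTLCache:
    """Minimal TTL-based DNS-style cache: entries expire after their TTL."""
    def __init__(self):
        self._store = {}  # name -> (value, expires_at)

    def set(self, name, value, ttl):
        self._store[name] = (value, time.monotonic() + ttl)

    def get(self, name):
        entry = self._store.get(name)
        if entry is None:
            return None          # cache miss: query the next resolver layer
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[name]
            return None          # expired: re-resolve upstream
        return value             # cache hit: no upstream query needed

cache = TTLCache()
cache.set("api.example.com", "203.0.113.10", ttl=60)
print(cache.get("api.example.com"))  # served from cache within the TTL
```

Every layer in the chain (browser, OS, recursive resolver) runs some variant of this logic, which is why short TTLs multiply query load on the upstream layers.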
HTTP Protocol Versions¶
| Version | Multiplexing | Header Compression | Transport | Key Improvement |
|---|---|---|---|---|
| HTTP/1.1 | No (one request per TCP connection, or pipelining with head-of-line blocking) | No | TCP | Persistent connections, chunked transfer |
| HTTP/2 | Yes (multiple streams over one TCP connection) | HPACK | TCP | Eliminates HOL blocking at HTTP level, server push |
| HTTP/3 | Yes (multiple streams over QUIC) | QPACK | QUIC (UDP) | Eliminates TCP HOL blocking, faster handshakes (0-RTT) |
HTTP/2 is the baseline for modern web serving. It reduces latency by multiplexing requests over a single TCP connection and compressing headers; it also specified server push, though major browsers have since removed support for it. However, TCP's head-of-line blocking means a single lost packet stalls all streams.
HTTP/3 solves this by using QUIC (built on UDP), where each stream is independent—a lost packet only stalls its own stream. QUIC also combines the TLS and transport handshakes, achieving 0-RTT connection establishment for repeat visitors.
The TLS Handshake¶
Every HTTPS connection begins with a TLS handshake that establishes encryption:
TLS 1.3 Handshake (1 round trip):
Client Server
│ │
│──── ClientHello ──────────────────▶│ Supported cipher suites, key share
│ │
│◀─── ServerHello + Certificate ─────│ Chosen cipher, server cert, key share
│ EncryptedExtensions │
│ Finished │
│ │
│──── Finished ─────────────────────▶│ Client confirms
│ │
│◀═══ Encrypted Application Data ═══▶│ All subsequent data encrypted
TLS 1.3 reduced the handshake from 2 round trips (TLS 1.2) to 1 round trip, and supports 0-RTT (zero round trip time) for repeat connections by caching session parameters. The trade-off with 0-RTT is replay attack vulnerability—it should only be used for idempotent requests (GET).
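On the client side, the same protocol policy appears as TLS library configuration; a minimal sketch using Python's standard ssl module (context setup only, no network I/O):

```python
import ssl

# Build a client context that refuses anything older than TLS 1.2,
# mirroring the common server-side policy "ssl_protocols TLSv1.2 TLSv1.3".
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_2
ctx.maximum_version = ssl.TLSVersion.TLSv1_3

print(ctx.minimum_version, ctx.maximum_version)
# The handshake in the diagram above runs when this context wraps a socket
# (ctx.wrap_socket(sock, server_hostname=...)); the wrapped socket's
# .version() then reports which protocol version was negotiated.
```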
Nginx¶
Nginx (pronounced "engine-x") is the most widely used web server and reverse proxy, serving approximately 34% of all websites. It uses an asynchronous, event-driven architecture that handles thousands of concurrent connections efficiently with minimal memory overhead.
Core Architecture¶
Nginx uses a master-worker process model:
- Master process: Reads configuration, binds to ports, and manages worker processes. Runs as root (binding ports 80/443 requires root) while workers run as an unprivileged user (nginx or www-data).
- Worker processes: Handle the actual HTTP requests. Each worker is single-threaded but uses non-blocking I/O (epoll on Linux, kqueue on BSD/macOS) to handle thousands of connections concurrently.
┌── Master Process ──┐
│ Config loading │
│ Port binding │
│ Worker management │
└────────┬───────────┘
┌───────────┼───────────┐
▼ ▼ ▼
┌─Worker 1───┐ ┌─Worker 2───┐ ┌─Worker N───┐
│ Event      │ │ Event      │ │ Event      │
│ Loop       │ │ Loop       │ │ Loop       │
│ (epoll)    │ │ (epoll)    │ │ (epoll)    │
│            │ │            │ │            │
│ 1000s of   │ │ 1000s of   │ │ 1000s of   │
│ connections│ │ connections│ │ connections│
└────────────┘ └────────────┘ └────────────┘
Why event-driven is efficient: Traditional web servers (Apache prefork) spawn a process per connection—10,000 connections means 10,000 processes, each consuming ~10MB of memory. Nginx handles 10,000 connections in a single worker process using ~25MB total, because it never blocks waiting for I/O. Instead, it registers callbacks with the OS kernel (epoll_wait) and processes events as they become ready.
How epoll works: The Linux kernel's epoll system call efficiently monitors thousands of file descriptors. When a worker calls epoll_wait(), the kernel returns only the file descriptors that have activity (data received, write buffer available). The worker processes these events, initiates non-blocking I/O operations, and loops back to epoll_wait(). This means a single thread can manage thousands of concurrent connections with near-zero idle overhead.
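The epoll loop described above maps directly onto Python's standard selectors module (which selects epoll on Linux); a minimal sketch using a local socket pair to stand in for a client connection:

```python
import selectors
import socket

sel = selectors.DefaultSelector()        # EpollSelector on Linux

# A connected socket pair stands in for a client connection.
client, server_side = socket.socketpair()
server_side.setblocking(False)
sel.register(server_side, selectors.EVENT_READ, data="conn-1")

client.sendall(b"GET /")                 # make the registered fd readable

# The worker loop: block until some registered fd has activity,
# then handle only those fds; idle connections cost nothing.
for key, mask in sel.select(timeout=1):
    payload = key.fileobj.recv(1024)
    print(key.data, payload)             # conn-1 b'GET /'

sel.unregister(server_side)
client.close()
server_side.close()
```

A real worker would loop back to select() forever, registering new connections as accept() delivers them.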
Configuration Deep Dive¶
Location Matching¶
Nginx evaluates location blocks in a specific priority order:
# Priority order (highest to lowest):
# 1. Exact match (=)
location = /favicon.ico { } # Only matches exactly "/favicon.ico"
# 2. Preferential prefix (^~)
location ^~ /static/ { } # Prefix match, stops regex search
# 3. Regular expression (~ case-sensitive, ~* case-insensitive)
location ~ \.php$ { } # Matches *.php (case-sensitive)
location ~* \.(jpg|png|gif)$ { } # Matches image files (case-insensitive)
# 4. Prefix match (longest wins)
location /api/ { } # Matches /api/anything
location /api/v2/ { } # Matches /api/v2/anything (longer prefix wins)
location / { } # Catch-all (shortest prefix)
The matching algorithm: Nginx first checks all prefix locations and remembers the longest match. Then it checks regex locations in config order—the first regex match wins. If no regex matches, the longest prefix match is used. The = modifier skips all other checks, and ^~ skips regex checking.
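The priority rules can be sketched as a small matcher; this is an illustrative simplification (hypothetical match_location helper) that ignores nested and named locations:

```python
import re

def match_location(path, exact, preferential, regexes, prefixes):
    """Pick a location for `path` using nginx's priority order.
    exact/preferential/prefixes are lists of location strings;
    regexes are patterns in configuration order."""
    # 1. Exact match (=) wins outright.
    if path in exact:
        return ("=", path)
    # Longest prefix match, remembered for later.
    candidates = [p for p in preferential + prefixes if path.startswith(p)]
    longest = max(candidates, key=len, default=None)
    # 2. ^~ preferential prefix: if the longest prefix is ^~, skip regexes.
    if longest in preferential:
        return ("^~", longest)
    # 3. First matching regex (config order) wins.
    for pattern in regexes:
        if re.search(pattern, path):
            return ("~", pattern)
    # 4. Otherwise fall back to the longest plain prefix.
    return ("prefix", longest) if longest else None

print(match_location("/static/app.css", [], ["/static/"], [r"\.css$"], ["/"]))
# The ^~ /static/ prefix wins: it suppresses the regex check entirely
```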
Upstream Configuration¶
upstream app_backend {
# Load balancing algorithm
least_conn; # Route to server with fewest active connections
# Other options: round_robin (default), ip_hash, hash, random
# Backend servers with configuration
server 127.0.0.1:8001 weight=3; # Gets 3x traffic
server 127.0.0.1:8002 weight=1; # Gets 1x traffic
server 127.0.0.1:8003 backup; # Only used if others are down
server 127.0.0.1:8004 down; # Marked as permanently unavailable
# Health checking (passive — based on actual traffic)
# max_fails: number of failed attempts before marking server as unavailable
# fail_timeout: time window for max_fails AND how long to mark server as down
server 127.0.0.1:8005 max_fails=3 fail_timeout=30s;
# Keep-alive connections to upstream (reduces TCP handshake overhead)
keepalive 32; # Pool of 32 keep-alive connections per worker
keepalive_timeout 60s; # Close idle connections after 60 seconds
keepalive_requests 1000; # Max requests per keep-alive connection
}
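The max_fails/fail_timeout semantics described in the comments can be sketched as a small passive-health tracker (illustrative; nginx keeps this state per worker):

```python
import time

class PassiveHealth:
    """Mark a server down after max_fails failures within fail_timeout seconds,
    then keep it down for fail_timeout seconds before retrying."""
    def __init__(self, max_fails=3, fail_timeout=30.0):
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.failures = []       # timestamps of recent failed requests
        self.down_until = 0.0

    def record_failure(self, now=None):
        now = time.monotonic() if now is None else now
        # Keep only failures inside the sliding window.
        self.failures = [t for t in self.failures if now - t < self.fail_timeout]
        self.failures.append(now)
        if len(self.failures) >= self.max_fails:
            self.down_until = now + self.fail_timeout
            self.failures.clear()

    def is_available(self, now=None):
        now = time.monotonic() if now is None else now
        return now >= self.down_until

server = PassiveHealth(max_fails=3, fail_timeout=30.0)
for t in (0, 1, 2):                  # three failed proxied requests in 2 seconds
    server.record_failure(now=t)
print(server.is_available(now=10))   # False: marked down for 30s
print(server.is_available(now=40))   # True: timeout elapsed, server retried
```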
Caching Configuration¶
# Define a cache zone (in http block)
proxy_cache_path /var/cache/nginx/app
levels=1:2 # Two-level directory hashing
keys_zone=app_cache:10m # 10MB of shared memory for cache keys (~80,000 keys)
max_size=1g # Maximum disk usage
inactive=60m # Remove cached items not accessed in 60 minutes
use_temp_path=off; # Write directly to cache dir (better performance)
server {
location /api/ {
proxy_pass http://app_backend;
proxy_cache app_cache;
# Cache configuration
proxy_cache_valid 200 10m; # Cache 200 responses for 10 minutes
proxy_cache_valid 404 1m; # Cache 404 responses for 1 minute
proxy_cache_valid any 5m; # Cache everything else for 5 minutes
proxy_cache_key "$scheme$request_method$host$request_uri";
proxy_cache_use_stale error timeout updating http_500 http_502 http_503;
# ↑ Serve stale cached content when backend is erroring or updating
# Add headers to show cache status
add_header X-Cache-Status $upstream_cache_status;
# Values: MISS, HIT, STALE, UPDATING, BYPASS, EXPIRED, REVALIDATED
# Bypass cache for specific conditions
proxy_cache_bypass $http_authorization; # Don't cache authenticated requests
proxy_no_cache $http_authorization;
}
}
WebSocket Proxying¶
WebSocket connections require special proxy configuration because they upgrade from HTTP to a persistent bidirectional protocol:
location /ws/ {
proxy_pass http://app_backend;
proxy_http_version 1.1;
# These headers are required for WebSocket upgrade
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
# WebSocket connections are long-lived — increase timeouts
proxy_read_timeout 3600s; # 1 hour
proxy_send_timeout 3600s;
}
Performance Tuning¶
# /etc/nginx/nginx.conf — Performance-optimized configuration
# Worker processes: one per CPU core
worker_processes auto;
# File descriptor limit per worker (must be >= worker_connections * 2)
worker_rlimit_nofile 65535;
events {
worker_connections 4096; # Max connections per worker
use epoll; # Linux kernel event notification mechanism
multi_accept on; # Accept multiple connections at once
}
http {
# Sendfile: let the kernel send files directly from disk to socket
# bypassing user-space (zero-copy). Crucial for static file performance.
sendfile on;
# tcp_nopush: send headers and the beginning of a file in one TCP packet
# Works with sendfile. Reduces number of packets for large file transfers.
tcp_nopush on;
# tcp_nodelay: disable Nagle's algorithm — send data immediately
# without waiting to fill a TCP packet. Reduces latency for small responses.
tcp_nodelay on;
# Keep-alive optimization
keepalive_timeout 65; # Close idle connections after 65 seconds
keepalive_requests 10000; # Max requests per keep-alive connection
# Buffer optimization
client_body_buffer_size 16k; # Buffer for client request bodies
client_header_buffer_size 1k; # Buffer for client request headers
large_client_header_buffers 4 8k; # For large headers (cookies, auth tokens)
client_max_body_size 10m; # Max upload size
# Proxy buffer optimization
proxy_buffer_size 4k; # Buffer for first part of response (headers)
proxy_buffers 8 16k; # 8 buffers of 16k for response body
proxy_busy_buffers_size 32k; # Max size of buffers being sent to client
# Compression
gzip on;
gzip_vary on; # Add Vary: Accept-Encoding header
gzip_proxied any; # Compress proxied responses too
gzip_comp_level 4; # 1-9 (4 is good balance of CPU vs compression)
gzip_min_length 1000; # Don't compress tiny responses
gzip_types
text/plain
text/css
text/xml
text/javascript
application/json
application/javascript
application/xml
application/rss+xml
image/svg+xml;
# Logging (optional: disable access log for high-traffic static assets)
access_log /var/log/nginx/access.log;
error_log /var/log/nginx/error.log warn;
}
Worker connections calculation: Each proxy connection uses 2 file descriptors (one for client, one for upstream). So worker_connections 4096 supports ~2048 simultaneous proxy connections per worker. With 4 workers, that's ~8192 concurrent connections. Adjust worker_rlimit_nofile to at least worker_connections * 2.
Security Configuration¶
# Rate limiting: multiple zones for different endpoints
limit_req_zone $binary_remote_addr zone=general:10m rate=30r/s;
limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/m;
limit_req_zone $binary_remote_addr zone=api:10m rate=100r/s;
# Connection limiting
limit_conn_zone $binary_remote_addr zone=addr:10m;
server {
listen 443 ssl http2;
server_name example.com;
# SSL/TLS Configuration (modern)
ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384';
ssl_prefer_server_ciphers off; # Let client choose (TLS 1.3 ignores this)
# OCSP Stapling: server fetches certificate revocation status and sends it
# to clients, avoiding the client needing to contact the CA
ssl_stapling on;
ssl_stapling_verify on;
ssl_trusted_certificate /etc/letsencrypt/live/example.com/chain.pem;
resolver 8.8.8.8 8.8.4.4 valid=300s;
# Session caching: avoid repeating full TLS handshake for returning clients
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 1d;
ssl_session_tickets off; # Disable for forward secrecy
# Security Headers
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-XSS-Protection "1; mode=block" always; # Legacy; modern browsers ignore this header
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;
add_header Content-Security-Policy "default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'" always;
add_header Permissions-Policy "camera=(), microphone=(), geolocation=()" always;
# Hide Nginx version
server_tokens off;
# Request size limit (prevent upload abuse)
client_max_body_size 10m;
# Rate limiting applied per location
location /api/auth/ {
limit_req zone=auth burst=3 nodelay;
limit_conn addr 5;
proxy_pass http://app_backend;
}
location /api/ {
limit_req zone=api burst=20 nodelay;
# burst=20: allow bursts of 20 requests beyond the rate
# nodelay: process burst requests immediately (don't queue them)
proxy_pass http://app_backend;
}
}
Rate limiting explained: rate=100r/s creates a token bucket that fills at 100 tokens per second. Each request consumes one token. burst=20 adds a buffer of 20 tokens. Without nodelay, excess requests within the burst are delayed (queued); with nodelay, they're processed immediately but the burst bucket still drains at the configured rate.
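A minimal token-bucket sketch of these semantics (hypothetical TokenBucket class; the capacity model is simplified relative to nginx's internal accounting):

```python
import time

class TokenBucket:
    """limit_req-style bucket: refills at `rate` tokens/sec,
    with `burst` tokens of headroom beyond the steady rate."""
    def __init__(self, rate, burst):
        self.rate = rate
        self.capacity = burst + 1        # one token for the request itself + burst headroom
        self.tokens = self.capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True                  # request proceeds (nodelay semantics)
        return False                     # rejected: bucket drained faster than it refills

bucket = TokenBucket(rate=100, burst=20)
allowed = sum(bucket.allow() for _ in range(50))
print(allowed)  # ~21 back-to-back requests pass; the rest are rejected
```

Without nodelay, rejected-but-within-burst requests would instead be queued and released as tokens refill.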
Production Configuration Example¶
# /etc/nginx/nginx.conf — Complete production configuration
user nginx;
worker_processes auto;
pid /run/nginx.pid;
worker_rlimit_nofile 65535;
error_log /var/log/nginx/error.log warn;
events {
worker_connections 4096;
use epoll;
multi_accept on;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
# Logging format with timing info
log_format main '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent" '
'rt=$request_time uct=$upstream_connect_time '
'uht=$upstream_header_time urt=$upstream_response_time';
access_log /var/log/nginx/access.log main;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
keepalive_requests 10000;
gzip on;
gzip_vary on;
gzip_proxied any;
gzip_comp_level 4;
gzip_min_length 1000;
gzip_types text/plain text/css text/xml text/javascript
application/json application/javascript application/xml
image/svg+xml;
# Rate limiting zones
limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/m;
limit_req_zone $binary_remote_addr zone=api:10m rate=100r/s;
# Upstream backends
upstream app_backend {
least_conn;
server 127.0.0.1:8001 max_fails=3 fail_timeout=30s;
server 127.0.0.1:8002 max_fails=3 fail_timeout=30s;
server 127.0.0.1:8003 max_fails=3 fail_timeout=30s;
keepalive 64;
}
# HTTP → HTTPS redirect
server {
listen 80;
server_name example.com www.example.com;
return 301 https://example.com$request_uri;
}
# www → non-www redirect
server {
listen 443 ssl http2;
server_name www.example.com;
ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
return 301 https://example.com$request_uri;
}
# Main server block
server {
listen 443 ssl http2;
server_name example.com;
ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 1d;
ssl_stapling on;
ssl_stapling_verify on;
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
server_tokens off;
# Static files
location /static/ {
alias /var/www/app/static/;
expires 365d;
add_header Cache-Control "public, immutable";
access_log off;
}
# Health check endpoint (for load balancer)
location = /health {
access_log off;
default_type text/plain;
return 200 "healthy\n";
}
# Auth endpoints (strict rate limiting)
location /api/auth/ {
limit_req zone=auth burst=3 nodelay;
proxy_pass http://app_backend;
include /etc/nginx/proxy_params;
}
# API endpoints
location /api/ {
limit_req zone=api burst=20 nodelay;
proxy_pass http://app_backend;
include /etc/nginx/proxy_params;
}
# WebSocket
location /ws/ {
proxy_pass http://app_backend;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_read_timeout 3600s;
include /etc/nginx/proxy_params;
}
# Frontend (SPA)
location / {
root /var/www/app/dist;
try_files $uri $uri/ /index.html;
}
}
}
Apache HTTP Server¶
Apache is one of the oldest and most configurable web servers (since 1995). Unlike Nginx's event-driven model, Apache traditionally uses a process-per-connection (prefork) or thread-per-connection (worker) model, though the modern event MPM is event-driven.
Multi-Processing Modules (MPMs)¶
| MPM | Model | Connections | Memory | Use Case |
|---|---|---|---|---|
| prefork | One process per connection | Low (hundreds) | High (~10MB/process) | PHP mod_php (not thread-safe), maximum stability |
| worker | Thread pool per process | Medium (thousands) | Medium (~1MB/thread) | Thread-safe applications, moderate traffic |
| event | Event-driven + thread pool | High (thousands) | Lower | Modern default, similar to Nginx for keep-alive |
The event MPM is Apache's answer to Nginx's efficiency. It uses dedicated threads to handle keep-alive connections, freeing worker threads to process requests. For keep-alive-heavy workloads, event MPM dramatically reduces resource usage compared to prefork.
Apache vs. Nginx¶
| Feature | Nginx | Apache |
|---|---|---|
| Architecture | Event-driven, async | Process/thread per connection (or event MPM) |
| Performance | Excellent for static content, reverse proxy | Good, but higher memory per connection |
| Configuration | Centralized files, requires reload | .htaccess for per-directory overrides (no reload) |
| Module loading | Compiled-in or dynamic (limited) | Highly modular, dynamic loading at runtime |
| URL rewriting | rewrite directive (simpler) | mod_rewrite (powerful but complex regex engine) |
| Use case | Reverse proxy, load balancer, static files | Dynamic content (mod_php), .htaccess flexibility |
| Market share | ~34% | ~29% (declining) |
When to use Apache: Apache's .htaccess files allow per-directory configuration without reloading the server—useful for shared hosting where users need to configure their own URL rewrites, authentication, and caching rules. However, .htaccess has a performance cost: Apache checks for .htaccess files in every directory in the path for every request.
mod_rewrite¶
Apache's mod_rewrite is a powerful URL rewriting engine used for redirects, pretty URLs, and complex routing:
# .htaccess — Common mod_rewrite patterns
RewriteEngine On
# Force HTTPS
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
# Remove trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [L,R=301]
# SPA fallback (route all non-file requests to index.html)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.html [L]
# Proxy API requests to backend
RewriteRule ^api/(.*)$ http://localhost:8080/api/$1 [P,L]
Caddy¶
Caddy is a modern web server written in Go, notable for automatic HTTPS (obtains and renews Let's Encrypt certificates automatically with zero configuration).
Automatic HTTPS Internals¶
Caddy implements the ACME (Automatic Certificate Management Environment) protocol to obtain certificates from Let's Encrypt:
- Challenge: Caddy proves domain ownership using one of:
  - HTTP-01: Place a token at http://domain/.well-known/acme-challenge/TOKEN (requires port 80)
  - TLS-ALPN-01: Present a self-signed certificate with a specific ALPN extension during the TLS handshake (requires port 443)
  - DNS-01: Create a DNS TXT record at _acme-challenge.domain (works behind firewalls, supports wildcards)
- Issuance: Once verified, Let's Encrypt issues the certificate (valid for 90 days)
- Renewal: Caddy automatically renews certificates ~30 days before expiration
- OCSP stapling: Caddy automatically staples OCSP responses
Caddyfile Configuration¶
# Caddyfile — Caddy's configuration format
example.com {
# Automatic HTTPS — no SSL config needed!
# Reverse proxy to application
reverse_proxy /api/* localhost:8080 {
# Load balancing
lb_policy least_conn
# Health checking
health_uri /health
health_interval 10s
health_timeout 5s
# Headers
header_up X-Real-IP {remote_host}
header_up X-Forwarded-Proto {scheme}
# Transport configuration
transport http {
keepalive 30s
keepalive_idle_conns 64
}
}
# Static file serving
root * /var/www/site
file_server {
hide .git .env
}
# Compression
encode gzip zstd
# Security headers
header {
X-Frame-Options "SAMEORIGIN"
X-Content-Type-Options "nosniff"
Strict-Transport-Security "max-age=31536000"
-Server # Remove Server header
}
# Rate limiting (with caddy-ratelimit plugin)
rate_limit {remote.host} 100r/m
# Logging
log {
output file /var/log/caddy/access.log {
roll_size 100mb
roll_keep 5
}
format json
}
}
Caddy vs. Nginx: Caddy's primary advantages are automatic HTTPS, simpler configuration syntax, and a single binary with no dependencies. Nginx has a larger ecosystem, more community resources, and marginally better raw performance for extremely high-throughput scenarios. For most applications, the difference is negligible, and Caddy's simplicity reduces operational risk.
HAProxy¶
HAProxy (High Availability Proxy) is a dedicated, high-performance load balancer and proxy. While Nginx handles both web serving and proxying, HAProxy is purpose-built for proxying and load balancing, and excels at very high connection rates (millions of concurrent connections).
Architecture¶
HAProxy uses a single-process, event-driven model (similar to Nginx workers). It's designed for maximum throughput and minimum latency in the proxy path.
Configuration¶
# /etc/haproxy/haproxy.cfg
global
log /dev/log local0
maxconn 50000
user haproxy
group haproxy
daemon
# SSL tuning
ssl-default-bind-options ssl-min-ver TLSv1.2
ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256
tune.ssl.default-dh-param 2048
defaults
log global
mode http # Layer 7 mode (use 'tcp' for Layer 4)
option httplog # Detailed HTTP logging
option dontlognull # Don't log health check connections
option forwardfor # Add X-Forwarded-For header
timeout connect 5s
timeout client 30s
timeout server 30s
timeout http-keep-alive 10s
timeout http-request 10s # Max time for client to send full request
# Retry on connection failure
retries 3
option redispatch # Retry on a different server if one fails
# Stats dashboard (accessible at :8404/stats)
frontend stats
bind *:8404
stats enable
stats uri /stats
stats refresh 10s
stats auth admin:secretpassword
# Frontend: incoming traffic
frontend http_front
bind *:80
bind *:443 ssl crt /etc/ssl/certs/example.pem
redirect scheme https if !{ ssl_fc }
# Route based on host header
acl is_api hdr(host) -i api.example.com
acl is_web hdr(host) -i www.example.com
use_backend api_servers if is_api
default_backend web_servers
# Backend: API servers
backend api_servers
balance leastconn
option httpchk GET /health # Active health checking
# Health check configuration
default-server inter 5s fall 3 rise 2
# inter: check every 5s | fall: 3 failures = down | rise: 2 successes = up
server api1 10.0.1.10:8080 check weight 100
server api2 10.0.1.11:8080 check weight 100
server api3 10.0.1.12:8080 check weight 50 # Smaller instance, less traffic
# Connection draining: when a server is marked "drain",
# existing connections finish but no new ones are routed
# Useful during deployments: mark server drain → wait → deploy → mark ready
# Backend: Web servers
backend web_servers
balance roundrobin
option httpchk GET /health
cookie SRVID insert indirect nocache # Session persistence via cookie
server web1 10.0.2.10:3000 check cookie web1
server web2 10.0.2.11:3000 check cookie web2
HAProxy Stick Tables¶
Stick tables track connection information for rate limiting, session persistence, and abuse detection:
# Rate limiting with stick tables
frontend http_front
bind *:443 ssl crt /etc/ssl/certs/example.pem
# Track request rates per IP
stick-table type ip size 100k expire 30s store http_req_rate(10s)
http-request track-sc0 src
http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }
# ↑ Deny with 429 if more than 100 requests in 10 seconds from same IP
Traefik¶
Traefik is a modern reverse proxy designed for microservices and container environments. Its killer feature is automatic service discovery: it watches Docker, Kubernetes, and other orchestrators and automatically configures routing to new services.
Docker Integration¶
# docker-compose.yml with Traefik
version: '3'
services:
traefik:
image: traefik:v3.0
command:
- --api.dashboard=true
- --providers.docker=true
- --providers.docker.exposedByDefault=false
- --entrypoints.web.address=:80
- --entrypoints.websecure.address=:443
- --certificatesresolvers.letsencrypt.acme.email=admin@example.com
- --certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json
- --certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web
ports:
- "80:80"
- "443:443"
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- letsencrypt:/letsencrypt
api:
image: myapp/api:latest
labels:
# Traefik discovers this service via Docker labels
- traefik.enable=true
- traefik.http.routers.api.rule=Host(`api.example.com`)
- traefik.http.routers.api.tls.certresolver=letsencrypt
- traefik.http.services.api.loadbalancer.server.port=8080
# Middleware: rate limiting
- traefik.http.middlewares.api-ratelimit.ratelimit.average=100
- traefik.http.middlewares.api-ratelimit.ratelimit.burst=50
- traefik.http.routers.api.middlewares=api-ratelimit
frontend:
image: myapp/frontend:latest
labels:
- traefik.enable=true
- traefik.http.routers.frontend.rule=Host(`www.example.com`)
- traefik.http.routers.frontend.tls.certresolver=letsencrypt
volumes:
letsencrypt:
Traefik vs. Nginx as Kubernetes Ingress: Traefik natively integrates with Kubernetes IngressRoute CRDs and automatically discovers services. Nginx Ingress Controller requires ConfigMaps or annotations and has a more traditional configuration model. Traefik is often preferred for dynamic environments; Nginx Ingress for environments where the configuration is more static and the Nginx ecosystem (ModSecurity, etc.) is needed.
Load Balancing Deep Dive¶
Algorithms¶
| Algorithm | Description | Best For |
|---|---|---|
| Round Robin | Distributes requests sequentially across servers | Equal-capacity servers, stateless requests |
| Weighted Round Robin | Like round robin, but servers with higher weights get more traffic | Mixed server sizes (e.g., c5.xlarge + c5.2xlarge) |
| Least Connections | Routes to the server with fewest active connections | Requests with variable processing times |
| IP Hash | Hash of client IP determines server (consistent mapping) | Session persistence without cookies |
| Consistent Hashing | Hash ring that minimizes redistribution when servers are added/removed | Caching proxies (only 1/N keys move when adding a server) |
| Random Two Choices | Pick 2 random servers, route to the one with fewer connections | Large server pools (simple, surprisingly effective) |
Consistent hashing is worth understanding deeply. Traditional hash-based routing (e.g., server = hash(client_ip) % N) breaks when N changes—adding or removing a server remaps almost all clients. Consistent hashing places servers on a virtual ring (hash values 0 to 2^32). Each request is hashed and routed to the next server clockwise on the ring. When a server is added, only its neighbors' traffic is affected (~1/N keys move instead of all).
Consistent Hash Ring (0 to 2^32):
                Server A (hash: 100)
                       ●
                 ╱           ╲
Server D (700)  ●             ●  Server B (hash: 300)
                 ╲           ╱
                       ●
                Server C (hash: 500)
Request with hash 250 → routes to Server B (next clockwise)
Request with hash 450 → routes to Server C
Adding Server E (hash: 400) → only requests 301-400 move from C to E
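A minimal sketch of the ring using bisect over sorted hash points (real implementations such as ketama add many virtual nodes per server to even out the distribution):

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, servers):
        # Place each server at a point on the ring (0 .. 2^32 - 1).
        self.points = sorted(
            (int(hashlib.md5(s.encode()).hexdigest(), 16) % 2**32, s)
            for s in servers
        )

    def route(self, key):
        h = int(hashlib.md5(key.encode()).hexdigest(), 16) % 2**32
        # Next server clockwise: first point >= the key's hash (wrapping to 0).
        i = bisect.bisect_left(self.points, (h, ""))
        return self.points[i % len(self.points)][1]

ring = HashRing(["10.0.1.10", "10.0.1.11", "10.0.1.12"])
before = {k: ring.route(k) for k in ("user:1", "user:2", "user:3")}

# Adding a server moves only the keys that now hash to it.
bigger = HashRing(["10.0.1.10", "10.0.1.11", "10.0.1.12", "10.0.1.13"])
moved = sum(1 for k, s in before.items() if bigger.route(k) != s)
print(moved)  # only a fraction of keys remap, never all of them
```

Compare this with modulo routing, where changing the server count remaps almost every key.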
Layer 4 vs. Layer 7 Load Balancing¶
| Feature | Layer 4 (TCP/UDP) | Layer 7 (HTTP/HTTPS) |
|---|---|---|
| Routing basis | IP address, port | URL path, host header, cookies, HTTP headers |
| Performance | Higher throughput, lower latency | Slightly lower throughput (must parse HTTP) |
| SSL termination | Passthrough or termination | Termination (can inspect HTTP after decryption) |
| Features | Basic health checks, connection limiting | Content-based routing, header manipulation, WAF |
| Use cases | Database load balancing, raw TCP services, ultra-high throughput | HTTP APIs, web applications, microservice routing |
| AWS service | NLB (Network Load Balancer) | ALB (Application Load Balancer) |
When to use L4: When you need the absolute highest performance (millions of requests/sec), when the backend protocol isn't HTTP (databases, custom TCP protocols), or when you want SSL passthrough (backend handles its own TLS).
When to use L7: When you need content-based routing (route /api/v1/users to user-service, /api/v1/orders to order-service), when you need to inspect or modify HTTP headers, or when you want advanced features like sticky sessions with cookies.
Health Checking¶
| Type | Mechanism | Pros | Cons |
|---|---|---|---|
| Active | Load balancer periodically sends health check requests to backends | Detects failures before user traffic is affected | Adds load to backends, may trigger rate limits |
| Passive | Monitor actual traffic responses for errors | No additional load, reflects real user experience | Requires traffic to detect failures (cold start problem) |
| Combined | Active checks supplement passive monitoring | Best of both worlds | More complex configuration |
# Nginx active health checking (requires nginx-plus or third-party module)
upstream backend {
zone backend 64k;
server 10.0.1.1:8080;
server 10.0.1.2:8080;
# Check /health every 5 seconds, consider healthy after 2 passes,
# unhealthy after 3 failures
health_check interval=5s fails=3 passes=2 uri=/health;
}
Connection Draining¶
During deployments, you need to gracefully remove servers from the load balancer pool without dropping active connections:
- Mark server as draining: Stop routing new connections to it
- Wait for active connections to complete: Set a drain timeout (e.g., 30 seconds)
- Deploy: Update the application on the drained server
- Re-add server: Mark it as healthy and start routing traffic again
This is handled automatically by Kubernetes during rolling updates (via terminationGracePeriodSeconds and preStop hooks) and by cloud load balancers (deregistration delay on target groups).
Caching at the Proxy Layer¶
Cache-Control Headers¶
HTTP caching is governed by the Cache-Control header. Understanding these directives is essential:
| Directive | Description |
|---|---|
| public | Response can be cached by any cache (CDN, proxy, browser) |
| private | Response can only be cached by the browser (not CDN/proxy) |
| no-cache | Cache can store the response BUT must revalidate with the origin before serving (misleading name!) |
| no-store | Response must never be cached anywhere |
| max-age=N | Response is fresh for N seconds |
| s-maxage=N | Like max-age but only for shared caches (CDN/proxy); overrides max-age for shared caches |
| stale-while-revalidate=N | Serve stale content for up to N seconds while fetching fresh content in the background |
| stale-if-error=N | Serve stale content for up to N seconds if the origin returns an error |
| immutable | Content will never change; the browser should never revalidate (use with cache-busted URLs) |
Practical caching strategy:
- Static assets (hashed filenames like `app.a1b2c3.js`): `Cache-Control: public, max-age=31536000, immutable`
- API responses (user-specific): `Cache-Control: private, no-cache`
- API responses (public, changes slowly): `Cache-Control: public, max-age=300, stale-while-revalidate=60`
- HTML pages: `Cache-Control: no-cache` (always revalidate, but a 304 Not Modified with an ETag avoids re-sending the body)
- Authenticated API responses: `Cache-Control: private, no-store`
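As a rough sketch of how a shared cache (CDN/proxy) acts on these directives, the following hypothetical helpers parse a Cache-Control header and apply the storability and freshness rules above. This is heavily simplified; real caches implement the full RFC 9111 rules:

```python
# Simplified Cache-Control handling for a SHARED cache (CDN/proxy).
# Illustrative only; real caches follow RFC 9111 in far more detail.

def parse_cache_control(header: str) -> dict:
    """Parse 'public, max-age=300, stale-while-revalidate=60' into a dict."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name] = int(value) if value.isdigit() else True
    return directives

def shared_cache_can_store(d: dict) -> bool:
    # no-store forbids caching anywhere; private restricts to browser caches.
    return not d.get("no-store") and not d.get("private")

def is_fresh(d: dict, age_seconds: int) -> bool:
    # s-maxage overrides max-age for shared caches.
    ttl = d.get("s-maxage", d.get("max-age", 0))
    return age_seconds < ttl

d = parse_cache_control("public, max-age=300, stale-while-revalidate=60")
print(shared_cache_can_store(d))      # True
print(is_fresh(d, age_seconds=120))   # True: 120s old, fresh for 300s
print(is_fresh(d, age_seconds=400))   # False: past max-age (though it may
                                      # still be served stale while revalidating)
```

Note how `no-cache` parses as storable but never fresh (ttl 0), which matches its real meaning: store, but always revalidate before serving.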
Microcaching¶
Even caching dynamic content for 1 second can dramatically reduce backend load under high traffic. If 10,000 requests hit the same endpoint within 1 second, only 1 reaches the backend:
proxy_cache_path /var/cache/nginx/micro
levels=1:2 keys_zone=micro:1m max_size=100m;
location /api/popular-endpoint {
proxy_pass http://app_backend;
proxy_cache micro;
proxy_cache_valid 200 1s; # Cache successful responses for just 1 second
proxy_cache_lock on; # Only one request fetches from backend;
# others wait for the cached result
proxy_cache_use_stale updating; # Serve stale while refreshing
}
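The `proxy_cache_lock` behavior (request coalescing, sometimes called single-flight) is what makes microcaching so effective: concurrent requests for the same key share one backend fetch. A minimal thread-based sketch of the idea (class name hypothetical; a real cache would also expire entries after their TTL):

```python
# Single-flight cache sketch, like nginx's proxy_cache_lock: only one caller
# fetches from the backend; concurrent callers wait and reuse its result.
# Illustrative only; a real micro-cache also needs per-entry TTL expiry.
import threading

class MicroCache:
    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}       # key -> cached value
        self._inflight = {}      # key -> Event signalling "fetch finished"

    def get(self, key, fetch):
        with self._lock:
            if key in self._entries:
                return self._entries[key]       # cache hit
            event = self._inflight.get(key)
            if event is None:                   # we are the fetching caller
                event = threading.Event()
                self._inflight[key] = event
                we_fetch = True
            else:
                we_fetch = False
        if we_fetch:
            value = fetch()                     # the ONE backend request
            with self._lock:
                self._entries[key] = value
                del self._inflight[key]
            event.set()
            return value
        event.wait()                            # others wait for that result
        with self._lock:
            return self._entries[key]

calls = []
def fetch():
    calls.append(1)
    return "response-body"

cache = MicroCache()
threads = [threading.Thread(target=cache.get, args=("k", fetch)) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print(len(calls))  # 1: ten concurrent requests, a single backend fetch
```

This is the "10,000 requests, 1 backend hit" effect in miniature: the lock plus the in-flight map guarantee exactly one `fetch()` per key per cache window.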
Reverse Proxy vs. Forward Proxy¶
| Feature | Reverse Proxy | Forward Proxy |
|---|---|---|
| Position | In front of servers | In front of clients |
| Purpose | Protect/optimize backend servers | Protect/anonymize clients |
| Use cases | Load balancing, SSL termination, caching, security, compression | Corporate firewalls, anonymity, content filtering, compliance |
| Client awareness | Client doesn't know about backend servers | Server doesn't know about actual clients |
| Configuration | Server-side (server admin configures) | Client-side (client configures proxy settings) |
| Examples | Nginx, Caddy, HAProxy, Traefik | Squid, corporate proxies, VPNs, SOCKS proxies |
Reverse proxy use cases in detail:
- SSL/TLS termination: Decrypt HTTPS at the proxy so backend servers handle plain HTTP (simpler, faster backend processing)
- Static file serving: Serve CSS/JS/images directly from disk without hitting the application server
- Compression: gzip/brotli compress responses, reducing bandwidth usage by 60-80%
- Request buffering: Buffer slow client uploads and send to the backend all at once, freeing the backend connection
- Security: Hide backend topology, rate limit abusive clients, filter malicious requests
- A/B testing: Route a percentage of traffic to different backend versions
Web Application Firewall (WAF)¶
A WAF inspects HTTP/HTTPS traffic and blocks malicious requests before they reach your application. It operates at Layer 7, understanding HTTP semantics.
What a WAF Protects Against¶
| Attack Category | Examples | WAF Rule |
|---|---|---|
| SQL Injection | `' OR 1=1 --`, `UNION SELECT` | Detect SQL keywords in query params/body |
| Cross-Site Scripting (XSS) | `<script>alert(1)</script>` | Detect HTML/JS in input fields |
| Path Traversal | `../../etc/passwd` | Detect directory traversal patterns |
| HTTP Protocol Violations | Malformed headers, invalid methods | Enforce HTTP spec compliance |
| Bot Traffic | Scrapers, credential stuffing | Rate limiting, CAPTCHA, bot signatures |
| File Inclusion | `include=http://evil.com/shell.php` | Block remote file references |
OWASP Core Rule Set (CRS) is the standard open-source WAF rule set, covering the OWASP Top 10 vulnerabilities. It runs on ModSecurity (Nginx/Apache module), Coraza (Go), and cloud WAFs.
Cloud WAFs (AWS WAF, Cloudflare WAF, Azure WAF) provide managed rule sets and are simpler to operate than self-hosted solutions. AWS WAF integrates with ALB and CloudFront, inspecting requests at the edge before they reach your backend.
False positives are the biggest operational challenge with WAFs. A legitimate request containing the word SELECT (e.g., a product named "Select Premium") might be blocked by SQL injection rules. Manage false positives with: a learning mode/logging-only period before enforcement, per-rule exceptions, URI-specific rule overrides, and regular rule tuning.
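A toy signature-based rule check makes both the mechanism and the false-positive problem concrete. This is purely illustrative: real rule sets like the OWASP CRS use hundreds of rules plus anomaly scoring, not one regex per attack category, and the patterns below are deliberately simplistic:

```python
# Toy signature-based WAF check (illustrative; OWASP CRS uses hundreds of
# rules plus anomaly scoring, not a single regex per category).
import re

RULES = {
    "sql_injection":  re.compile(r"(?i)\bunion\s+select\b|'\s*or\s+1=1"),
    "path_traversal": re.compile(r"\.\./"),
    "xss":            re.compile(r"(?i)<script\b"),
}

def inspect(value: str):
    """Return the names of the rules this input triggers (empty = allowed)."""
    return [name for name, pattern in RULES.items() if pattern.search(value)]

print(inspect("id=1' or 1=1 --"))       # ['sql_injection']
print(inspect("../../etc/passwd"))      # ['path_traversal']
print(inspect("name=Select Premium"))   # []: these rules pass it, but a
                                        # naive rule matching bare SELECT
                                        # would block this legitimate name
```

The last case is exactly why WAFs ship with a logging-only mode: a rule that keys on bare SQL keywords instead of attack-shaped patterns blocks "Select Premium" along with real injections.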
Production Deployment Patterns¶
Zero-Downtime Deployments with Nginx¶
Nginx supports graceful reloads (nginx -s reload) that enable zero-downtime configuration changes:
- Master process spawns new worker processes with the new configuration
- Master process signals old worker processes to stop accepting new connections
- Old workers finish processing active requests, then exit
- Only new workers remain, running the updated configuration
No connections are dropped during this process.
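The per-worker behavior is the same drain-then-exit pattern as connection draining: on a shutdown signal, stop accepting new work, finish what is in flight, then exit. A minimal sketch of one worker (illustrative only; nginx implements this in C across its master/worker processes, driven by signals like SIGHUP and SIGQUIT):

```python
# Graceful-worker sketch: on shutdown, accept no new work but finish all
# in-flight requests before exiting. Illustrative; nginx does this in C
# across master/worker processes, signalled via SIGHUP/SIGQUIT.
import queue
import threading

class Worker:
    def __init__(self):
        self.requests = queue.Queue()
        self.shutting_down = threading.Event()
        self.handled = []

    def submit(self, request) -> bool:
        if self.shutting_down.is_set():
            return False              # old worker refuses NEW connections
        self.requests.put(request)
        return True

    def run(self):
        # Serve until shutdown is requested AND the queue is drained.
        while not (self.shutting_down.is_set() and self.requests.empty()):
            try:
                req = self.requests.get(timeout=0.01)
            except queue.Empty:
                continue
            self.handled.append(req)  # "process" the in-flight request

w = Worker()
w.submit("req-1"); w.submit("req-2")
w.shutting_down.set()                  # like SIGQUIT to an old worker
w.run()                                # drains the two in-flight requests
print(w.handled, w.submit("req-3"))    # ['req-1', 'req-2'] False
```

During a real reload, the new workers are already accepting connections while the old ones run this drain loop, which is why no requests are dropped.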
Blue-Green with Proxy Switching¶
# Blue-green deployment: switch backends by changing upstream
# Active (green):
upstream app_backend {
server 10.0.1.10:8080; # Green instances
server 10.0.1.11:8080;
}
# To switch to blue, update config and reload:
# upstream app_backend {
# server 10.0.2.10:8080; # Blue instances
# server 10.0.2.11:8080;
# }
Canary Releases via Weighted Routing¶
# Route 5% of traffic to canary, 95% to stable
upstream app_stable {
server 10.0.1.10:8080;
server 10.0.1.11:8080;
}
upstream app_canary {
server 10.0.2.10:8080;
}
# split_clients hashes the key (here the client IP) into fixed percentage
# buckets, so each client consistently lands on the same variant
split_clients "${remote_addr}" $app_variant {
    5%  app_canary;
    *   app_stable;
}
server {
    location /api/ {
        # Proxy to the upstream chosen by split_clients; this avoids the
        # fragile "if + proxy_pass" pattern inside a location block
        proxy_pass http://$app_variant;
    }
}
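What `split_clients` does under the hood is hash the key and compare the hash against percentage slices of the hash space, which is why a given client always lands on the same variant without any cookies or state. A rough Python equivalent (nginx uses MurmurHash2, so exact bucket assignments will differ; the function name here is made up):

```python
# Deterministic canary split in the style of nginx's split_clients.
# Illustrative; nginx uses MurmurHash2, so actual assignments differ.
import zlib

def pick_variant(client_ip: str, canary_percent: float = 5.0) -> str:
    # Hash the key into a stable 32-bit bucket, then check whether it
    # falls inside the canary's slice of the hash space.
    bucket = zlib.crc32(client_ip.encode()) & 0xFFFFFFFF
    if bucket < (canary_percent / 100.0) * 2**32:
        return "canary"
    return "stable"

# The same client always gets the same variant (sticky without cookies):
assert pick_variant("203.0.113.7") == pick_variant("203.0.113.7")

# Roughly canary_percent of a large client population hits the canary:
ips = [f"10.0.{i // 256}.{i % 256}" for i in range(10000)]
share = sum(pick_variant(ip) == "canary" for ip in ips) / len(ips)
print(f"{share:.1%}")  # close to 5%; the exact value depends on the hash
```

Because the assignment is a pure function of the key, ramping the canary from 5% to 10% only moves clients from stable to canary, never the reverse, which keeps user experience consistent during a rollout.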
A/B Testing at the Proxy Layer¶
# Route users to different backends based on cookie
map $cookie_ab_group $backend {
"A" app_variant_a;
"B" app_variant_b;
default app_variant_a;
}
upstream app_variant_a {
server 10.0.1.10:8080;
}
upstream app_variant_b {
server 10.0.2.10:8080;
}
server {
location /api/ {
# Set A/B cookie if not present
if ($cookie_ab_group = "") {
add_header Set-Cookie "ab_group=A; Path=/; Max-Age=2592000";
# (In practice, randomize A/B assignment with a map or Lua script)
}
proxy_pass http://$backend;
}
}
Comparison Summary¶
| Feature | Nginx | Apache | Caddy | HAProxy | Traefik |
|---|---|---|---|---|---|
| Primary role | Web server + proxy | Web server | Web server + proxy | Load balancer + proxy | Proxy + discovery |
| Auto HTTPS | No (use certbot) | No (use certbot) | Yes (built-in) | No | Yes (built-in) |
| Config language | Custom DSL | Custom DSL | Caddyfile / JSON | Custom DSL | YAML / labels |
| Dynamic config | Reload required | .htaccess (per-dir) | API + hot reload | Reload required | Auto-discovery |
| K8s integration | Ingress controller | Limited | Ingress controller | Ingress controller | Native (CRDs) |
| Performance | Excellent | Good | Good | Excellent | Good |
| Best for | General-purpose | Legacy, mod_php | Simple setups, auto-TLS | High-perf LB | Container environments |