Skip to content

Chapter 1 — Big Picture Overview

The backend is the invisible engine behind every application. It is where business logic executes, data is stored and retrieved, security is enforced, and performance is won or lost. Understanding the backend requires understanding how multiple systems interact under load, under attack, and under the constraints of physics (network latency, disk I/O, CPU cycles).


Every interaction between a user and your system follows a path. Understanding this path — and where things can go wrong — is the foundation of backend engineering.

┌──────────┐ HTTPS ┌──────────────┐ ┌───────────────┐
│ Client │───────────────▶│ Load │─────▶│ API Gateway │
│ (Browser/ │ TLS 1.3 │ Balancer │ │ (Auth, Rate │
│ Mobile) │◀───────────────│ (L4/L7) │◀─────│ Limiting) │
└──────────┘ └──────────────┘ └───────┬───────┘
┌───────────▼───────────┐
│ Application Server │
│ ┌─────────────────┐ │
│ │ Middleware │ │
│ │ ├─ Auth check │ │
│ │ ├─ Input valid. │ │
│ │ ├─ Rate limit │ │
│ │ └─ Logging │ │
│ ├─────────────────┤ │
│ │ Business Logic │ │
│ │ ├─ Controllers │ │
│ │ ├─ Services │ │
│ │ └─ Models/ORM │ │
│ └─────────────────┘ │
└───────────┬───────────┘
┌───────────────────────────────┼───────────────────────┐
│ │ │
┌───────▼───────┐ ┌────────▼────────┐ ┌────────▼────────┐
│ Database │ │ Cache Layer │ │ Message Queue │
│ (PostgreSQL, │ │ (Redis, │ │ (RabbitMQ, │
│ MySQL) │ │ Memcached) │ │ Kafka) │
└───────────────┘ └─────────────────┘ └─────────────────┘

1. Client → Load Balancer (Network Layer)

  • TLS handshake occurs (1-2 round trips for TLS 1.2, 1 for TLS 1.3, 0 for resumed sessions)
  • The load balancer distributes traffic using algorithms: round-robin, least-connections, IP-hash, or weighted
  • L4 balancers (TCP level) are faster but less intelligent; L7 balancers (HTTP level) can inspect headers, route by URL path, and terminate TLS

2. Load Balancer → API Gateway

  • Authentication verification (JWT validation, API key lookup)
  • Rate limiting enforcement (token bucket, sliding window)
  • Request transformation and routing
  • The gateway is the first line of defense — it should reject bad requests before they consume application resources

3. API Gateway → Application Server

  • Middleware pipeline executes sequentially
  • Each middleware can short-circuit the request (e.g., auth failure returns 401 immediately)
  • Business logic executes within a request context that carries user identity, permissions, and request metadata

4. Application Server → Data Layer

  • The ORM or query builder translates business operations into SQL
  • Connection pooling manages a finite set of database connections shared across requests
  • Cache checks happen before database queries (cache-aside pattern)
  • Write operations may be queued for asynchronous processing

The database is the single most critical component in most backends. It is the source of truth, the bottleneck, and the attack surface all at once.

Latency Hierarchy
┌─────────────────────────────────────────┐
│ L1 Cache Access ~1 ns │ ← CPU register
│ L2 Cache Access ~4 ns │
│ L3 Cache Access ~12 ns │
│ RAM Access ~100 ns │ ← Application memory
│ SSD Random Read ~16 μs │ ← Database storage
│ SSD Sequential Read ~1 ms/MB │
│ HDD Random Read ~2-10 ms │ ← Legacy databases
│ Network Round Trip ~0.5 ms │ ← DB on same network
│ Cross-region Network ~30-100 ms │ ← Distributed DB
└─────────────────────────────────────────┘
1 ms = 1,000 μs = 1,000,000 ns

A single database query involves:

  1. Network hop from application to database server (~0.5ms on same VPC)
  2. Query parsing — the SQL text is lexed and parsed into an AST
  3. Query planning — the optimizer evaluates possible execution plans (which index to use, join order, etc.)
  4. Execution — the plan is executed, involving disk I/O or buffer pool reads
  5. Result serialization — rows are serialized and sent back over the network

A simple query might take 1-5ms. A complex join across millions of rows without proper indexing can take seconds or minutes.

Server 1 ──┐
Server 2 ──┼──▶ DATABASE (single source of truth)
Server 3 ──┘
Every server reads and writes to the same database.
This makes the database the scalability chokepoint.

You can horizontally scale application servers trivially — just add more instances behind the load balancer. But you cannot trivially scale a relational database. This asymmetry is the root cause of most backend scaling challenges.


Security is not a layer — it is a property that must exist at every layer.

Attack Surface Map
═══════════════════════════════════════════════════════════
LAYER ATTACK VECTORS
─────────────────────────────────────────────────────────
Client/Network Man-in-the-middle, TLS downgrade,
DNS poisoning, DDoS
Load Balancer Slowloris, HTTP flood,
header manipulation
API Gateway Brute force auth, token theft,
rate limit bypass
Application SQL injection, XSS, CSRF, SSRF,
business logic flaws, insecure
deserialization, path traversal
Database SQL injection (escaped from app),
privilege escalation, data exfil,
unencrypted backups
Infrastructure Container escape, misconfigured
IAM, exposed secrets, unpatched
CVEs
═══════════════════════════════════════════════════════════

No single security measure is sufficient. Security is layered:

┌─────────────────────────────────────────────┐
│ Network Security (Firewalls, VPC, TLS) │
│ ┌─────────────────────────────────────────┐│
│ │ Application Security (Auth, Validation)││
│ │ ┌─────────────────────────────────────┐││
│ │ │ Data Security (Encryption, Access) │││
│ │ │ ┌─────────────────────────────────┐│││
│ │ │ │ Monitoring (Audit, Alerting) ││││
│ │ │ └─────────────────────────────────┘│││
│ │ └─────────────────────────────────────┘││
│ └─────────────────────────────────────────┘│
└─────────────────────────────────────────────┘

If the firewall is breached, the application layer should still prevent unauthorized access. If the application has a bug, the database’s row-level security should limit damage. If data is stolen, encryption should make it useless without the keys.


Performance problems in backends follow the 80/20 rule: 80% of latency comes from 20% of the code paths, and almost always that 20% involves the database.

┌────────────────────────────────────────────────────────┐
│ │
│ 1. DATABASE QUERIES (most common) │
│ - Missing indexes → full table scan │
│ - N+1 queries → 100 queries instead of 1 │
│ - Lock contention → queries waiting on each other │
│ - Large result sets → transferring too much data │
│ │
│ 2. NETWORK I/O │
│ - Chatty services → too many microservice calls │
│ - No connection pooling → new TCP handshake each │
│ request │
│ - Synchronous external API calls in request path │
│ │
│ 3. MEMORY │
│ - ORM loading entire object graphs into memory │
│ - Unbounded caches growing until OOM │
│ - Memory leaks from unclosed connections/streams │
│ │
│ 4. CPU │
│ - Expensive serialization (JSON parsing large │
│ payloads) │
│ - Cryptographic operations on the hot path │
│ - Regex backtracking (ReDoS) │
│ │
│ 5. CONCURRENCY │
│ - Thread pool exhaustion │
│ - Connection pool exhaustion │
│ - Deadlocks │
│ - Lock convoy (all threads waiting for one lock) │
│ │
└────────────────────────────────────────────────────────┘

Attacks target the weakest point in your system. Here’s the attack taxonomy by frequency and severity:

HIGH FREQUENCY
│ SQL Injection ● Credential Stuffing ●
│ XSS ● DDoS ●
│ Broken Auth ● CSRF ●
│ Insecure APIs ●
│ SSRF ● Supply Chain ●
│ Deserialization ● Container Escape ●
LOW FREQUENCY
└──────────────────────────────────────────────▶
LOW IMPACT HIGH IMPACT
Reconnaissance → Scanning your APIs, finding endpoints
Initial Access → SQL injection, stolen credentials, SSRF
Privilege Escalation → Accessing admin routes, escalating DB roles
Lateral Movement → Moving from app server to DB server to backups
Data Exfiltration → Dumping the database, stealing secrets
Persistence → Backdoor accounts, cron jobs, modified code

Not all optimization is equal. You must optimize the critical path — the sequence of operations that determines your response time.

HIGHEST IMPACT
═══════════════════════════════════════════
1. Architecture decisions
- Monolith vs. microservices
- Database choice (SQL vs. NoSQL)
- Sync vs. async processing
- Caching strategy
───────────────────────────────────────────
2. Database design
- Schema normalization level
- Index strategy
- Query patterns
- Partitioning scheme
───────────────────────────────────────────
3. Algorithm & data structure choices
- O(n²) vs O(n log n) in hot paths
- Hash maps vs. sorted sets
- Streaming vs. buffering
───────────────────────────────────────────
4. I/O optimization
- Connection pooling
- Batch operations
- Async I/O
- Compression
───────────────────────────────────────────
5. Code-level micro-optimization
- Usually the wrong place to start
- Only after profiling proves it matters
═══════════════════════════════════════════
LOWEST IMPACT

Measure first. Optimize second. Measure again.

Premature optimization based on intuition wastes time and adds complexity. Profile your system under realistic load. Find the actual bottleneck. Fix that one thing. Measure again. Repeat.


LOW LOAD (10 req/s)
════════════════════════════════════════
Everything is fast. Queries return in 2ms.
No contention. No cache needed.
Security checks are negligible overhead.
MEDIUM LOAD (1,000 req/s)
════════════════════════════════════════
Database connections become scarce.
Connection pooling becomes essential.
Cache hit ratio determines latency.
Slow queries start blocking others.
Rate limiting becomes necessary.
HIGH LOAD (100,000 req/s)
════════════════════════════════════════
Database is the bottleneck.
Read replicas are mandatory.
Caching is the only thing keeping you alive.
Connection pools are maxed out.
Every middleware adds measurable latency.
Security checks must be O(1) not O(n).
Sharding or partitioning is required.
EXTREME LOAD (1,000,000+ req/s)
════════════════════════════════════════
Multiple database clusters.
Multi-layer caching (L1 in-process, L2 Redis).
Event-driven architecture replaces request-response.
Write operations are queued and batched.
Security is enforced at the edge (CDN/gateway level).
Every microsecond of latency costs money.

When a senior backend engineer looks at a system, they think in terms of:

  1. Data flow — Where does data enter, transform, store, and exit?
  2. Failure modes — What happens when each component fails?
  3. Bottlenecks — Where will the system break under 10x load?
  4. Attack surface — Where can an attacker inject, intercept, or escalate?
  5. Consistency guarantees — What happens during concurrent writes?
  6. Operational complexity — Can the team actually run this at 3 AM?

This mental model guides every decision from choosing a database to designing an API endpoint. Keep it in mind as you proceed through the rest of this handbook.