Chapter 8 — Expert-Level Engineering Notes

8.1 How Senior Engineers Think About Performance

The Performance Thinking Framework

    AMATEUR APPROACH:
    "This endpoint is slow. Let me add an index."
    → Might fix it. Might not. Might make writes slower.

    EXPERT APPROACH:
    1. What is the actual latency? (measure with p50, p95, p99)
    2. What is the acceptable latency? (SLO: p99 < 200ms)
    3. Where is time being spent? (profile: DB? network? CPU? I/O?)
    4. What is the theoretical minimum? (can't be faster than network RTT)
    5. What is the cheapest change that closes the gap?
    6. What are the side effects of that change?
    7. Implement. Measure. Verify.

The Hierarchy of Performance Interventions

    MOST IMPACT, LEAST EFFORT (do these first)
    ═══════════════════════════════════════════════════════
    │
    │  1. Remove unnecessary work
    │     - Are you fetching data you don't use? (SELECT *)
    │     - Are you making calls you don't need? (N+1)
    │     - Are you computing things that don't change? (cache it)
    │     - Are you doing synchronously what could be async?
    │
    │  2. Cache the result
    │     - Can you serve a cached response? (Redis, CDN)
    │     - Can you memoize an expensive computation?
    │     - Can you precompute during off-peak hours?
    │
    │  3. Batch operations
    │     - Can you send 1 query instead of 100?
    │     - Can you write in bulk instead of one at a time?
    │     - Can you use pipelines (Redis PIPELINE)?
    │
    │  4. Add an index
    │     - But verify with EXPLAIN first
    │     - Consider the write amplification cost
    │     - Monitor index usage after creation
    │
    │  5. Rewrite the query
    │     - Different join strategy
    │     - Materialize a view
    │     - Denormalize the schema
    │
    │  6. Scale the infrastructure
    │     - Add read replicas
    │     - Increase connection pool
    │     - Larger instance size
    │     - This is the LAST resort, not the first
    │
    LEAST IMPACT, MOST EFFORT (do these last)

The “Thundering Herd” Mental Model

Senior engineers always ask: “What happens when 1000 users do this simultaneously?“

    Single user:
    Request → Cache miss → DB query (5ms) → Cache set → Response
    Perfectly fine.

    1000 users simultaneously:
    1000 requests → 1000 cache misses → 1000 DB queries → DB overload
    This is the thundering herd problem.

    Expert thinking:
    "Every cache miss is a potential thundering herd.
     Every write operation is a potential lock contention.
     Every external call is a potential cascade failure.
     Design for the concurrent case, not the single-user case."

8.2 How Senior Engineers Think About Security

The Threat Modeling Framework

    STRIDE Model (Microsoft):
    ═══════════════════════════════════════════════════════

    S — Spoofing (impersonating another user/service)
        Can someone pretend to be a different user?
        → Strong authentication, mutual TLS

    T — Tampering (modifying data in transit/at rest)
        Can someone modify data without detection?
        → Integrity checks, digital signatures, HMAC

    R — Repudiation (denying an action was performed)
        Can someone deny they performed an action?
        → Audit logging, non-repudiation via signatures

    I — Information Disclosure (unauthorized data access)
        Can someone see data they shouldn't?
        → Encryption, access control, output sanitization

    D — Denial of Service (preventing legitimate access)
        Can someone make the system unavailable?
        → Rate limiting, DDoS protection, resource quotas

    E — Elevation of Privilege (gaining unauthorized access)
        Can someone escalate from user to admin?
        → Least privilege, authorization checks on every request

Security Thinking Patterns

    PATTERN 1: "Trust nothing, verify everything"
    ═══════════════════════════════════════════════════════

    ✗ Amateur: "The frontend validates the input, so the backend is safe"
    ✓ Expert: "The frontend is untrusted. Validate everything on the server.
               Assume every input is an attack until proven otherwise."

    ✗ Amateur: "The JWT is signed, so the claims must be correct"
    ✓ Expert: "The JWT signature proves it wasn't modified,
               but the issuer might have made a mistake.
               Check expiry. Check audience. Check permissions in your own system."


    PATTERN 2: "What's the blast radius?"
    ═══════════════════════════════════════════════════════

    Before granting any permission, ask:
    "If this credential is compromised, what can the attacker do?"

    If the answer is "everything" → you've violated least privilege.
    Each component should have just enough access to do its job.

    ┌────────────────────────────────────────┐
    │  API key scoped to: read:users         │  Blast radius: can read users
    │  API key scoped to: *                  │  Blast radius: EVERYTHING
    │                                        │
    │  DB user: SELECT on orders             │  Blast radius: can read orders
    │  DB user: superuser                    │  Blast radius: can drop database
    └────────────────────────────────────────┘


    PATTERN 3: "Defense in depth"
    ═══════════════════════════════════════════════════════

    Don't rely on a single security control. Layer them:

    Layer 1: WAF blocks known attack patterns → stops 80%
    Layer 2: Input validation rejects bad input → stops 15%
    Layer 3: Parameterized queries prevent injection → stops 4.9%
    Layer 4: DB user has minimal permissions → limits remaining 0.1%
    Layer 5: Encryption at rest → stolen data is useless
    Layer 6: Audit logs → detect what slipped through

    Any single layer might fail. All six failing is extremely unlikely.


    PATTERN 4: "Assume breach"
    ═══════════════════════════════════════════════════════

    Don't just try to prevent attacks. Plan for when they succeed.

    ✗ "We have a firewall, so we're safe"
    ✓ "When the attacker gets through the firewall, what limits the damage?"

    This leads to:
    - Network segmentation (breach one service, not all)
    - Encrypted sensitive data (stolen data is useless)
    - Short-lived credentials (stolen tokens expire quickly)
    - Monitoring and alerting (detect breach within minutes, not months)
    - Incident response plan (practiced, not just documented)

8.3 How Senior Engineers Think About Scaling

The Scaling Decision Framework

    QUESTION 1: Do you ACTUALLY need to scale?
    ═══════════════════════════════════════════════════════

    Before adding complexity:
    - Have you optimized your queries? (A well-indexed query on
      a single PostgreSQL instance handles 10,000+ queries/second)
    - Have you added caching? (Redis serves 100,000+ ops/second)
    - Have you fixed your N+1 queries?
    - Are you fetching data you don't need?

    A single well-tuned PostgreSQL instance with 64GB RAM can handle
    most applications up to 10K requests/second. That's enough for
    most companies in the world.

    Premature scaling adds:
    - Operational complexity
    - Data consistency challenges
    - Higher costs
    - More failure modes
    - Debugging difficulty


    QUESTION 2: Can you scale vertically first?
    ═══════════════════════════════════════════════════════

    Vertical scaling (bigger machine):
    - Double RAM: $X/month
    - Faster CPU: $Y/month
    - NVMe SSD: $Z/month
    - Time to implement: 30 minutes (resize instance)

    Horizontal scaling (more machines):
    - Read replicas: weeks of engineering
    - Sharding: months of engineering
    - Application changes for consistency
    - Time to implement: weeks to months

    Vertical scaling is almost always cheaper and simpler
    up to a certain point. Only go horizontal when you hit
    the limits of a single machine.

    Modern cloud instances go up to:
    - 24 TB RAM (AWS u-24tb1)
    - 448 vCPUs
    - 60+ TB NVMe storage

    You probably don't need sharding.


    QUESTION 3: What dimension are you scaling?
    ═══════════════════════════════════════════════════════

    ┌─────────────────────┬──────────────────────────────────────┐
    │  Scaling Dimension  │  Solutions                           │
    ├─────────────────────┼──────────────────────────────────────┤
    │  Read throughput     │  Read replicas, caching, CDN        │
    │  Write throughput    │  Sharding, async writes, CQRS       │
    │  Data volume         │  Partitioning, archival, compression│
    │  Compute             │  Horizontal app servers, queues     │
    │  Concurrent users    │  Connection pooling, WebSockets     │
    │  Geographic reach    │  Multi-region, edge computing       │
    └─────────────────────┴──────────────────────────────────────┘

    Different problems have different solutions.
    Don't shard when you need a read replica.
    Don't add servers when you need a cache.

8.4 How Senior Engineers Think About Database Design

Schema Design Principles

    PRINCIPLE 1: Normalize first, denormalize where data proves you should
    ═══════════════════════════════════════════════════════

    Start with 3NF (Third Normal Form):
    - No duplicate data
    - Every non-key column depends only on the primary key
    - Easier to maintain, update, and ensure consistency

    Denormalize when:
    - You have a specific, measured performance problem
    - The query pattern is read-heavy and the join is expensive
    - You're willing to maintain the denormalized copy

    Example:
    Normalized: Order → JOIN User → get user_name
    Denormalized: Order.user_name (copied at order creation time)
    Tradeoff: Faster reads, but user_name changes don't propagate to old orders
    (Might be acceptable: "this was the name at time of order")


    PRINCIPLE 2: Design for your query patterns, not your objects
    ═══════════════════════════════════════════════════════

    ✗ Object-oriented DB design:
    "We have Users, Orders, Products, Reviews.
     One table per class. Normalize everything."

    ✓ Query-oriented DB design:
    "Our dashboard needs: orders per user per month.
     Our search needs: products with avg rating.
     Our checkout needs: user + cart + inventory (atomically).

     What schema makes these queries fast and safe?"

    The schema should serve the access patterns, not the object model.


    PRINCIPLE 3: Use database constraints, not application logic
    ═══════════════════════════════════════════════════════

    ✗ Checking uniqueness in application:
    if not User.query.filter_by(email=email).first():
        db.session.add(User(email=email))
    # Race condition: two requests both pass the check, both insert!

    ✓ Using database UNIQUE constraint:
    CREATE UNIQUE INDEX ON users (email);
    # Database guarantees uniqueness, even under concurrent writes
    # Application catches the unique violation exception

    Database constraints to always use:
    - NOT NULL (on everything that shouldn't be null)
    - UNIQUE (on natural keys: email, username, etc.)
    - FOREIGN KEY (on all references between tables)
    - CHECK (on value ranges: price > 0, status IN (...))
    - DEFAULT (sensible defaults reduce application complexity)

8.5 Tradeoffs Senior Engineers Navigate

Consistency vs. Availability (CAP Theorem in Practice)

    THE REAL QUESTION:
    "When a network partition occurs, do we sacrifice
     consistency (serve stale data) or availability (return errors)?"

    ┌──────────────────────────────────────────────────────────────┐
    │                                                              │
    │  Banking: Choose CONSISTENCY                                 │
    │  "Better to show an error than show wrong balance"           │
    │  → PostgreSQL, strong consistency, synchronous replication    │
    │  → User sees error if partition occurs                       │
    │                                                              │
    │  Social Media Feed: Choose AVAILABILITY                      │
    │  "Better to show slightly stale feed than show error"        │
    │  → Cassandra, eventual consistency, async replication        │
    │  → User sees feed (maybe 30s stale) even during partition   │
    │                                                              │
    │  E-commerce Product Page: Choose AVAILABILITY with limits   │
    │  "Show cached product data, but check inventory at checkout" │
    │  → Cache for reads (available), DB for writes (consistent)  │
    │  → Mix of both strategies                                    │
    │                                                              │
    └──────────────────────────────────────────────────────────────┘

Cost vs. Performance

    ┌──────────────────────────────────────────────────────────────┐
    │                 COST-PERFORMANCE CURVE                        │
    │                                                              │
    │  Performance                                                 │
    │  ▲                                                           │
    │  │                            ╭────── Diminishing returns    │
    │  │                        ╭───╯                              │
    │  │                    ╭───╯                                   │
    │  │                ╭───╯                                       │
    │  │            ╭───╯                                           │
    │  │        ╭───╯                                               │
    │  │    ╭───╯  ← Sweet spot                                    │
    │  │╭───╯                                                       │
    │  └───────────────────────────────────────────────▶ Cost      │
    │                                                              │
    │  The first $100/month in optimization (index, cache) gives   │
    │  10x improvement. The next $10,000/month in scaling gives    │
    │  2x improvement.                                             │
    │                                                              │
    │  Expert thinking:                                            │
    │  "What performance do we NEED? Not the maximum possible,     │
    │   but the minimum that meets our SLO. Then find the          │
    │   cheapest way to get there."                                │
    └──────────────────────────────────────────────────────────────┘

Build vs. Buy

    ┌────────────────────────────────────────────────────────────────┐
    │                                                                │
    │  BUILD when:                        BUY when:                  │
    │  ─────────                          ─────────                  │
    │  Core differentiator                Commodity functionality    │
    │  Unique requirements                Standard solution exists   │
    │  Security-critical + bespoke        Well-maintained SaaS       │
    │  Team has deep expertise            Team lacks expertise       │
    │  Long-term commitment               Need it this quarter       │
    │                                                                │
    │  Examples:                          Examples:                   │
    │  Your core business logic           Auth (Auth0, Clerk)        │
    │  Custom ML pipeline                 Email (SendGrid, SES)      │
    │  Unique data processing             Monitoring (Datadog)       │
    │                                     Search (Algolia)           │
    │                                     Payments (Stripe)          │
    │                                                                │
    │  The cost of building auth from scratch:                       │
    │  6-12 months of engineering + ongoing maintenance +            │
    │  security liability. Auth0 costs $23/month for 1000 users.    │
    │                                                                │
    │  Don't build what isn't your competitive advantage.            │
    └────────────────────────────────────────────────────────────────┘

8.6 Expert Decision-Making Heuristics

    HEURISTIC 1: "Make it work, make it right, make it fast"
    ═══════════════════════════════════════════════════════
    First get the correct behavior. Then clean up the code.
    Then optimize performance — but only if needed.
    Most code never needs the third step.

    HEURISTIC 2: "The best optimization is the one you don't need"
    ═══════════════════════════════════════════════════════
    Before optimizing, ask: can we change the requirements?
    "Can we show this page from cache (1ms) instead of
     computing it live (500ms)? Is 5-second-old data acceptable?"
    Often, relaxing freshness requirements eliminates the problem entirely.

    HEURISTIC 3: "Simple systems fail in simple ways"
    ═══════════════════════════════════════════════════════
    A monolith with one database has predictable failure modes.
    30 microservices with 15 databases have exponential failure modes.
    Choose the simplest architecture that meets your actual requirements.

    HEURISTIC 4: "Optimize for understanding, not cleverness"
    ═══════════════════════════════════════════════════════
    The person debugging your system at 3 AM should be able to
    understand it within 10 minutes. Clever optimizations that
    save 5ms but require a PhD to understand are usually not worth it.

    HEURISTIC 5: "Measure everything, optimize nothing (until you must)"
    ═══════════════════════════════════════════════════════
    Instrument everything from day one. But don't optimize until
    the metrics show you need to. Intuition about performance is
    wrong more often than it's right.

    HEURISTIC 6: "The database is always the bottleneck, until it isn't"
    ═══════════════════════════════════════════════════════
    Start by assuming the database is slow. It usually is.
    But once you've added caching, read replicas, and proper indexes,
    the bottleneck shifts to the application layer or network.
    Always re-measure after each optimization.