System Design Interviews: How to Think Big

Design Twitter, build a URL shortener, scale Netflix — how to approach the open-ended questions that terrify candidates.

Apr 22, 2026 · 5 min listen · 5 chapters

What you'll learn

The framework: requirements, estimation, high-level design, deep dive
Key concepts: load balancing, caching, database sharding, CDNs
How to communicate trade-offs and justify decisions
Common mistakes and what interviewers actually evaluate

1. The system design interview is a decision-making test

What interviewers are really grading

They are not expecting you to invent a perfect architecture in 45 minutes. They are checking whether you can:

Clarify ambiguous requirements
Estimate load with reasonable numbers
Build a high-level architecture that fits the scale
Dive deeper where the risk is highest
Explain trade-offs clearly

The four-part framework

Requirements
Estimation
High-level design
Deep dive

A useful mental model

Treat the interview like designing a city transit system. You first ask who needs to travel, how often, and where. Then you choose buses, trains, or both. Only after that do you worry about the exact schedule.

Good questions to ask early

Who are the users?
What are the top three features?
What latency target matters most?
What is the expected read-to-write ratio?
What failure is acceptable?

Example: URL shortener

If the only goal is redirecting short links, the design is very different from a shortener that also tracks every click in real time and supports custom branded domains.

[Bar chart: Example scale estimates]

2. Estimation turns vague ideas into concrete constraints

Estimation cheat sheet

Use simple math:

Daily requests = users × actions per user per day
Average requests per second = daily requests ÷ 86,400
Peak traffic is often 3x to 10x the average

Why estimation matters

It tells you whether you need:

One database or many
In-memory caching
Asynchronous processing
A content delivery network
Queue-based fanout

Analogy

Estimation is like checking the width of a river before choosing a bridge. You do not design the bridge first. You measure the span, then pick the structure that can actually hold it.

\text{Average RPS} = \frac{10{,}000{,}000 \times 2 \times 20}{86{,}400} \approx 4{,}630
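
The same arithmetic can be sketched in a few lines of Python. This is a rough illustration, not a capacity-planning tool: the inputs (10 million daily users, 2 sessions each, 20 timeline reads per session) are the example numbers from this lesson, and the function names are made up for the sketch. The 3x and 10x peak factors come from the cheat sheet above.

```python
SECONDS_PER_DAY = 86_400


def average_rps(daily_users: int, sessions_per_user: int, reads_per_session: int) -> float:
    """Average requests per second = daily requests / seconds in a day."""
    daily_requests = daily_users * sessions_per_user * reads_per_session
    return daily_requests / SECONDS_PER_DAY


def peak_rps(avg: float, peak_factor: float) -> float:
    """Scale the average by an assumed peak multiplier (often 3x to 10x)."""
    return avg * peak_factor


# Example inputs from the lesson: 10M users, 2 sessions/day, 20 reads/session.
avg = average_rps(10_000_000, 2, 20)
print(f"average: {avg:,.0f} rps")
for factor in (3, 10):
    print(f"peak at {factor}x: {peak_rps(avg, factor):,.0f} rps")
```

The point of writing it down is that the peak range, not the average, is what the design has to survive.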

[Line chart: Traffic pattern across a day]

3. High-level design is about finding the simplest system that can grow

Core building blocks

Load balancer

Spreads requests across servers. Common algorithms include round robin, least connections, and weighted routing.
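
To make two of those algorithms concrete, here is a minimal in-memory sketch of round robin and least connections. The class and method names are hypothetical, and real load balancers also handle health checks, timeouts, and concurrency, which this omits.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Sends each new request to the server with the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the request finishes so the count stays accurate.
        self.active[server] -= 1


rr = RoundRobinBalancer(["app-1", "app-2", "app-3"])
print([rr.pick() for _ in range(4)])

lc = LeastConnectionsBalancer(["app-1", "app-2"])
first, second = lc.acquire(), lc.acquire()
lc.release(second)
print(lc.acquire())
```

Round robin assumes requests cost roughly the same. Least connections is usually the better fit when some requests are much slower than others.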

Cache

Stores hot data in memory. Examples: Redis, Memcached.

Database

Stores durable records. Choose based on access pattern, consistency needs, and scale.

Sharding

Splits one logical dataset into multiple physical databases.
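
One simple way to route a record to a shard is to hash its key and take the result modulo the shard count. The sketch below assumes string keys such as a user ID; the function name and the choice of SHA-256 are illustrative, not something this lesson prescribes.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash.

    A stable hash is used on purpose: Python's built-in hash() for strings
    is randomized per process, so it would route the same key differently
    after a restart.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


for user_id in ("user:1", "user:2", "user:3"):
    print(user_id, "-> shard", shard_for(user_id, 8))
```

The downside worth naming in an interview: with plain modulo routing, changing the number of shards remaps most keys. Consistent hashing is a common way to limit that movement.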

CDN

Caches static content at edge locations close to users.

Trade-off snapshot

Caching improves latency but adds invalidation complexity
Sharding increases scale but makes queries and joins harder
CDNs reduce origin load but are best for static or semi-static content

Example: Twitter timeline

A common design is write once, read many. When a user posts a tweet, the system stores it and then makes it available to followers. For a small follower graph, you can fan out on write: copy the tweet into each follower's feed at post time. For a celebrity account with tens of millions of followers, pushing the tweet into every follower feed immediately becomes expensive. That is where hybrid approaches matter: fan out on write for ordinary accounts, and fetch celebrity tweets when a follower reads their timeline.
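
A hybrid approach can be sketched as follows: ordinary accounts fan out on write into precomputed feeds, while tweets from accounts above a follower threshold are merged in at read time. Everything here is a toy assumption made for illustration: the `TimelineService` name, the in-memory dictionaries standing in for storage, and the threshold value.

```python
from collections import defaultdict


class TimelineService:
    def __init__(self, celebrity_threshold: int):
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)          # author -> users following them
        self.following = defaultdict(set)          # user -> authors they follow
        self.tweets_by_author = defaultdict(list)  # author -> [(seq, author, text)]
        self.feeds = defaultdict(list)             # user -> precomputed feed entries
        self._seq = 0                              # stands in for a timestamp

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.celebrity_threshold

    def post(self, author, text):
        self._seq += 1
        entry = (self._seq, author, text)
        self.tweets_by_author[author].append(entry)
        if not self.is_celebrity(author):
            # Fan out on write: cheap when the follower list is small.
            for follower in self.followers[author]:
                self.feeds[follower].append(entry)

    def timeline(self, user, limit=10):
        entries = list(self.feeds[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                # Fan out on read: pull celebrity tweets at request time.
                entries.extend(self.tweets_by_author[author])
        # De-duplicate by sequence number in case an author crossed the
        # threshold after some tweets were already pushed.
        unique = {entry[0]: entry for entry in entries}
        return sorted(unique.values(), reverse=True)[:limit]


svc = TimelineService(celebrity_threshold=2)
svc.follow("u1", "star")
svc.follow("u2", "star")
svc.follow("u1", "friend")
svc.post("friend", "hi")
svc.post("star", "hello")
print(svc.timeline("u1"))
```

The sketch ignores accounts that drop back below the threshold, pagination, and ranking. Its only job is to show where the write cost moves to the read path.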

4. Deep dives are where you earn trust

Common deep-dive topics

URL shortener

Key generation
Collision handling
Redirect latency
Analytics pipeline
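
For key generation, one commonly discussed option is to take a unique numeric ID (for example, from a database sequence) and encode it in base 62. Because every ID is unique, the short keys cannot collide, which sidesteps collision handling at the cost of predictable keys. The sketch below assumes that approach; the function names and alphabet order are illustrative choices.

```python
import string

# 0-9, a-z, A-Z: 62 URL-safe characters.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Turn a non-negative integer ID into a short key."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, remainder = divmod(n, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(key: str) -> int:
    """Recover the integer ID from a short key."""
    n = 0
    for ch in key:
        n = n * BASE + ALPHABET.index(ch)
    return n


key = encode_base62(123_456_789)
print(key, decode_base62(key))
```

Seven base-62 characters cover 62^7, roughly 3.5 trillion IDs. If keys must not be guessable, a random key with a uniqueness check on insert is the usual alternative, and that is where collision handling comes back in.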

Twitter

Feed fanout strategy
Ranking versus chronological order
Hot users and celebrity accounts
Follow graph storage

Netflix

Video chunking
CDN placement
Adaptive bitrate streaming
Buffering and retry behavior

Trade-offs interviewers like

Strong consistency versus availability
Fanout on write versus fanout on read
SQL versus NoSQL
Cache freshness versus latency
Simplicity versus scale

If 1% of users generate 50% of traffic, optimize the hot path first.

A concrete example of trade-off language

Instead of saying, "I would use caching because it is faster," say:

"I would cache tweet metadata in Redis because timeline reads are far more frequent than writes. That reduces database pressure. The cost is cache invalidation after edits or deletes, so I would keep a short TTL and use write-through updates for critical fields."

That answer shows intent, mechanism, and downside.
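
The mechanism in that answer, a short TTL plus write-through updates, can be sketched with an in-memory stand-in. This is not Redis client code: the class name is hypothetical, a plain dict plays the role of the database, and the clock is injectable only so the expiry behavior is easy to demonstrate.

```python
import time


class WriteThroughTTLCache:
    def __init__(self, store: dict, ttl_seconds: float, clock=time.monotonic):
        self.store = store          # stands in for the database
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}          # key -> (value, expires_at)

    def get(self, key):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]         # fresh cache hit
        value = self.store.get(key)  # miss or expired: fall back to the store
        if value is not None:
            self._entries[key] = (value, now + self.ttl)
        else:
            self._entries.pop(key, None)
        return value

    def put(self, key, value):
        # Write-through: update the store and the cache together.
        self.store[key] = value
        self._entries[key] = (value, self.clock() + self.ttl)

    def delete(self, key):
        self.store.pop(key, None)
        self._entries.pop(key, None)


fake_now = [0.0]
db = {"tweet:1": "original text"}
cache = WriteThroughTTLCache(db, ttl_seconds=5, clock=lambda: fake_now[0])

print(cache.get("tweet:1"))       # loaded from the store, then cached
db["tweet:1"] = "edited behind the cache"
print(cache.get("tweet:1"))       # still the cached copy until the TTL expires
fake_now[0] = 6.0
print(cache.get("tweet:1"))       # expired, so re-read from the store
```

The second read shows the downside the answer names: a write that bypasses the cache stays invisible until the TTL runs out, which is why the TTL is kept short.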

5. What strong candidates do differently

Common mistakes

Starting with tools instead of requirements
Forgetting scale estimates
Designing for every edge case at once
Using microservices too early
Not discussing failure modes
Ignoring cost and operational complexity

What strong answers sound like

"Given this traffic estimate, a cache will remove pressure from the database."
"This data can be eventually consistent because the user experience allows slight delay."
"The bottleneck will be celebrity accounts, so I would handle them differently."

Final mental checklist

Requirements. Numbers. Architecture. Deep dive. Trade-offs. Failure modes.

[Pie chart: Interview time allocation]

Transcript

Welcome to Slate. Today we're looking at System Design Interviews: How to Think Big. We'll cover the framework (requirements, estimation, high-level design, deep dive); key concepts such as load balancing, caching, database sharding, and CDNs; how to communicate trade-offs and justify decisions; and common mistakes and what interviewers actually evaluate. Let's get into it.

A system design interview is not a trivia quiz. It is a test of how you turn a vague product idea into a working plan. The interviewer is watching for structure, not perfection. Think of it like sketching a building before pouring concrete. You do not start with the windows. You start with the load-bearing walls. The first move is requirements. Ask what the product must do, and just as important, what it must not do. For a URL shortener, do we need custom aliases? Expiration? Click analytics? For Twitter, do we need search, private accounts, direct messages, or only the public timeline? Then estimate scale. A back-of-the-envelope estimate is not a math contest. It is how you choose the right tools. If a service handles 10 requests per second, a simple database may be enough. If it handles 10,000 per second, every choice changes. Interviewers also want to see communication. Say your assumptions out loud. If you choose one option over another, explain the trade-off. A strong answer sounds like a sequence of decisions, each justified by the requirements and the numbers.

Estimation is where the interview stops being abstract. You are translating product shape into engineering pressure. Here is the key idea: traffic determines architecture. A system that serves a few million requests a day can use very different components from one serving billions. Suppose a feature gets 10 million daily active users, and each user opens the app twice a day. If each session creates 20 timeline reads, that is 400 million reads per day. Divide by 86,400 seconds, and the average is about 4,630 reads per second. Real traffic is spiky, so peak load may be several times higher. Now the design choices start to make sense. A cache can absorb repeated reads. A CDN, or Content Delivery Network, can move static media closer to users. Sharding can split a database across machines when one server is no longer enough. Do not chase perfect precision. The goal is to be directionally correct. If your estimate is off by 20 percent, that is fine. If you miss by 100x, your architecture will be wrong. The interviewer wants to see whether your numbers lead to sensible decisions.

Once the scale is visible, you can sketch the system. High-level design is not a pile of boxes. It is a chain of responsibilities. Requests come in, something routes them, something stores them, something speeds up repeated access, and something handles expensive work later. Load balancing is the front door. It spreads traffic across multiple servers so no single machine becomes a bottleneck. A cache is the shortcut path. If the same data is requested often, serving it from memory is much faster than asking the database every time. A database stores durable state. If one database cannot keep up, sharding splits the data by key, such as user ID or tweet ID, across multiple machines. For media-heavy systems like Netflix, a CDN matters because video files are large and users are geographically distributed. Instead of pulling every movie from one origin server, the CDN keeps copies near viewers. That lowers latency and protects the core servers. The best high-level design is usually boring in a good way. It uses familiar pieces in a way that matches the load and the product needs.

The deep dive is your chance to show engineering judgment. Pick the riskiest part of the system and go one level deeper. For a URL shortener, that might be key generation and redirect latency. For Twitter, it might be feed generation. For Netflix, it might be video delivery and adaptive bitrate streaming. This is where trade-offs become concrete. A cache speeds reads, but you must decide when data becomes stale. A sharded database scales writes, but cross-shard queries get harder. A CDN reduces latency, but it is not a replacement for the origin system. Each choice solves one problem and creates another. Say the trade-off out loud. If you choose eventual consistency for a feed, explain why that is acceptable. Users do not need every follower to see a post in the same millisecond. If you choose strong consistency for a payment record, explain why correctness matters more than speed. Interviewers like hearing the phrase "here is the bottleneck." That shows priority. You are not trying to optimize everything. You are identifying the one thing that will break first, then fixing that first.

Strong candidates do three things consistently. They keep the conversation structured. They connect every design choice back to a requirement or a number. And they know when to stop zooming in and move on. Many candidates make the same mistakes. They jump straight to microservices before defining the product. They name technologies without explaining why they fit. They ignore bottlenecks like hot keys, cache invalidation, or database rebalancing. They also forget failure. A real system has outages, retries, and partial data loss. If you mention those early, you sound like someone who has built things. What interviewers actually evaluate is not whether your answer matches theirs. They evaluate whether your thinking is coherent under uncertainty. Can you ask good questions? Can you make reasonable assumptions? Can you defend a design and revise it when new constraints appear? Think of the interview as a guided design review. The goal is not to impress with complexity. The goal is to show that you can build a system that works, explain why it works, and admit where it might break.
