System Design Interviews: How to Think Big
Design Twitter, build a URL shortener, scale Netflix — how to approach the open-ended questions that terrify candidates.
- The framework: requirements, estimation, high-level design, deep dive
- Key concepts: load balancing, caching, database sharding, CDNs
- How to communicate trade-offs and justify decisions
- Common mistakes and what interviewers actually evaluate
1. The system design interview is a decision-making test
What interviewers are really grading
Interviewers do not expect you to invent a perfect architecture in 45 minutes. They are checking whether you can:
- Clarify ambiguous requirements
- Estimate load with reasonable numbers
- Build a high-level architecture that fits the scale
- Dive deeper where the risk is highest
- Explain trade-offs clearly
The four-part framework
- Requirements
- Estimation
- High-level design
- Deep dive
A useful mental model
Treat the interview like designing a city transit system. You first ask who needs to travel, how often, and where. Then you choose buses, trains, or both. Only after that do you worry about the exact schedule.
Good questions to ask early
- Who are the users?
- What are the top three features?
- What latency target matters most?
- What is the expected read-to-write ratio?
- Which failures are acceptable?
Example: URL shortener
If the only goal is redirecting short links, the design is very different from a shortener that also tracks every click in real time and supports custom branded domains.
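To make the redirect-only variant concrete, here is a minimal sketch of how small that core can be. It assumes keys come from an incrementing counter encoded in base 62, which is one common approach and not the only one. The names `UrlShortener`, `encode_base62`, `shorten_custom`, and the in-memory dictionary are hypothetical stand-ins for a real service and datastore.

```python
import string

# 62 symbols: 0-9, a-z, A-Z.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(number: int) -> str:
    """Encode a non-negative integer as a base-62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


class UrlShortener:
    """Redirect-only shortener backed by a dict (assumed stand-in for a database)."""

    def __init__(self, start_id: int = 1):
        self._next_id = start_id
        self._urls = {}  # short key -> long URL

    def shorten(self, long_url: str) -> str:
        # Counter-based keys never collide with each other, but they can
        # collide with a custom alias, so skip any key already taken.
        key = encode_base62(self._next_id)
        while key in self._urls:
            self._next_id += 1
            key = encode_base62(self._next_id)
        self._next_id += 1
        self._urls[key] = long_url
        return key

    def shorten_custom(self, alias: str, long_url: str) -> str:
        if alias in self._urls:
            raise ValueError(f"alias already in use: {alias}")
        self._urls[alias] = long_url
        return alias

    def resolve(self, key: str):
        """Return the long URL for a key, or None if the key is unknown."""
        return self._urls.get(key)
```

Real-time click tracking or branded domains would add an analytics pipeline and per-domain routing on top of this, which is why the clarifying questions change the design so much.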
2. Estimation turns vague ideas into concrete constraints
Estimation cheat sheet
Use simple math:
- Daily requests = users × actions per user per day
- Average requests per second = daily requests ÷ 86,400
- Peak traffic is often 3x to 10x the average
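The cheat-sheet math can be written out as a short calculation. This is a rough sketch: the user count, actions per user, and the peak factor of 5 are illustrative assumptions, not figures for any real system.

```python
SECONDS_PER_DAY = 86_400


def estimate_load(daily_users: int, actions_per_user: float,
                  peak_factor: float = 5.0) -> dict:
    """Return daily requests, average requests/sec, and estimated peak requests/sec."""
    daily_requests = daily_users * actions_per_user
    average_rps = daily_requests / SECONDS_PER_DAY
    return {
        "daily_requests": daily_requests,
        "average_rps": average_rps,
        # Assumed multiplier; the text suggests somewhere between 3x and 10x.
        "peak_rps": average_rps * peak_factor,
    }


if __name__ == "__main__":
    load = estimate_load(daily_users=10_000_000, actions_per_user=20)
    print(f"{load['daily_requests']:,.0f} requests/day")
    print(f"{load['average_rps']:,.0f} average requests/sec")
    print(f"{load['peak_rps']:,.0f} peak requests/sec")
```

With these assumed inputs, 10 million users at 20 actions each give 200 million requests per day, a little over 2,300 requests per second on average.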
Why estimation matters
It tells you whether you need:
- One database or many
- In-memory caching
- Asynchronous processing
- A content delivery network
- Queue-based fanout
Analogy
Estimation is like checking the width of a river before choosing a bridge. You do not design the bridge first. You measure the span, then pick the structure that can actually hold it.
3. High-level design is about finding the simplest system that can grow
Core building blocks
Load balancer
Spreads requests across servers. Common algorithms include round robin, least connections, and weighted routing.
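Two of these algorithms can be sketched in a few lines. This is a simplified, single-process illustration; the class names and server labels are hypothetical, and weighted routing is left out to keep the example short.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Sends each request to the server with the fewest active connections."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._active = {server: 0 for server in servers}

    def acquire(self) -> str:
        # Ties go to the server listed first.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server: str) -> None:
        # Call when a request finishes so the count stays accurate.
        if self._active[server] > 0:
            self._active[server] -= 1
```

Round robin suits requests of similar cost. Least connections tends to help when some requests hold a connection much longer than others.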
Cache
Stores hot data in memory. Examples: Redis, Memcached.
Database
Stores durable records. Choose based on access pattern, consistency needs, and scale.
Sharding
Splits one logical dataset into multiple physical databases.
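One simple way to decide which physical database holds a record is to hash the key and take the result modulo the number of shards. The sketch below assumes that scheme; `shard_for` and the `user:` key format are hypothetical.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index in the range [0, shard_count)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # Use a stable hash. Python's built-in hash() for strings is randomized
    # per process, so it would route the same key differently after a restart.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


if __name__ == "__main__":
    for user_id in ("user:1", "user:2", "user:3"):
        print(user_id, "-> shard", shard_for(user_id, shard_count=4))
```

The downside of plain modulo hashing is that changing the shard count remaps most keys. Consistent hashing is one common alternative worth mentioning when the interviewer asks about resharding.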
CDN
Caches static content at edge locations close to users.
Trade-off snapshot
- Caching improves latency but adds invalidation complexity
- Sharding increases scale but makes queries and joins harder
- CDNs reduce origin load but are best for static or semi-static content

Example: Twitter timeline
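A common design is write once, read many: timelines are read far more often than tweets are posted, so the system does extra work at write time to keep reads cheap. When a user posts a tweet, the system stores it and then makes it available to followers. For an account with a small follower count, you can fan out on write by pushing the tweet into each follower's feed when it is posted. For a celebrity account with tens of millions of followers, pushing the tweet into every follower feed immediately becomes expensive. That is where hybrid approaches matter: fan out on write for most accounts, and pull celebrity tweets into the timeline at read time.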
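A hybrid approach can be sketched as follows: ordinary accounts fan out on write, while accounts above a follower threshold are merged in at read time. This is an in-memory illustration only. The `TimelineService` class, the threshold value, and the sequence-number ordering are assumptions made for the example, not a description of how Twitter works.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # hypothetical cutoff for switching strategies


class TimelineService:
    def __init__(self, celebrity_threshold: int = CELEBRITY_THRESHOLD):
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)  # author -> users following them
        self.following = defaultdict(set)  # user -> authors they follow
        self.feeds = defaultdict(list)     # user -> precomputed feed entries
        self.posts = defaultdict(list)     # author -> their own posts
        self._seq = 0                      # stand-in for a timestamp

    def follow(self, user: str, author: str) -> None:
        self.followers[author].add(user)
        self.following[user].add(author)

    def _is_celebrity(self, author: str) -> bool:
        return len(self.followers[author]) >= self.celebrity_threshold

    def post(self, author: str, text: str) -> None:
        self._seq += 1
        entry = (self._seq, author, text)
        self.posts[author].append(entry)
        if not self._is_celebrity(author):
            # Fan out on write: push into every follower's feed now.
            for follower in self.followers[author]:
                self.feeds[follower].append(entry)

    def timeline(self, user: str, limit: int = 20) -> list:
        # Start from the precomputed feed, then fan out on read for
        # celebrity accounts. A set avoids duplicates if an author crossed
        # the threshold after some posts were already pushed.
        entries = set(self.feeds[user])
        for author in self.following[user]:
            if self._is_celebrity(author):
                entries.update(self.posts[author])
        return sorted(entries, reverse=True)[:limit]
```

The sketch ignores backfilling older posts for new followers and any ranking beyond newest first. Both are reasonable follow-up topics for a deep dive.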
4. Deep dives are where you earn trust
Common deep-dive topics
URL shortener
- Key generation
- Collision handling
- Redirect latency
- Analytics pipeline
Twitter timeline
- Feed fanout strategy
- Ranking versus chronological order
- Hot users and celebrity accounts
- Follow graph storage
Netflix
- Video chunking
- CDN placement
- Adaptive bitrate streaming
- Buffering and retry behavior
Trade-offs interviewers like
- Strong consistency versus availability
- Fanout on write versus fanout on read
- SQL versus NoSQL
- Cache freshness versus latency
- Simplicity versus scale
A concrete example of trade-off language
Instead of saying, "I would use caching because it is faster," say:
"I would cache tweet metadata in Redis because timeline reads are far more frequent than writes. That reduces database pressure. The cost is cache invalidation after edits or deletes, so I would keep a short TTL and use write-through updates for critical fields."
That answer shows intent, mechanism, and downside.
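The mechanism in that answer, a short TTL combined with write-through updates, can be sketched as below. Plain dictionaries stand in for Redis and the database, and the `TweetCache` class and its method names are hypothetical.

```python
import time


class TweetCache:
    """Read-through cache with a TTL and write-through updates."""

    def __init__(self, database: dict, ttl_seconds: float = 30.0,
                 clock=time.monotonic):
        self._db = database      # assumed stand-in for the real database
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so tests can control time
        self._entries = {}       # tweet_id -> (expires_at, metadata)

    def _store(self, tweet_id, metadata) -> None:
        self._entries[tweet_id] = (self._clock() + self._ttl, metadata)

    def get(self, tweet_id):
        entry = self._entries.get(tweet_id)
        if entry is not None and entry[0] > self._clock():
            return entry[1]  # cache hit, still fresh
        # Miss or expired: fall back to the database and repopulate.
        metadata = self._db.get(tweet_id)
        if metadata is None:
            self._entries.pop(tweet_id, None)
            return None
        self._store(tweet_id, metadata)
        return metadata

    def update(self, tweet_id, metadata) -> None:
        # Write-through: update the database and the cache together.
        self._db[tweet_id] = metadata
        self._store(tweet_id, metadata)

    def delete(self, tweet_id) -> None:
        # Invalidate on delete so readers do not see a removed tweet.
        self._db.pop(tweet_id, None)
        self._entries.pop(tweet_id, None)
```

The TTL bounds how long a stale value can survive if some writer bypasses the cache, which is the downside the quoted answer is acknowledging.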
5. What strong candidates do differently
Common mistakes
- Starting with tools instead of requirements
- Forgetting scale estimates
- Designing for every edge case at once
- Using microservices too early
- Not discussing failure modes
- Ignoring cost and operational complexity
What strong answers sound like
- "Given this traffic estimate, a cache will remove pressure from the database."
- "This data can be eventually consistent because the user experience allows slight delay."
- "The bottleneck will be celebrity accounts, so I would handle them differently."
Final mental checklist
Requirements. Numbers. Architecture. Deep dive. Trade-offs. Failure modes.