System Design
Design Twitter
A simple Twitter design that addresses the key concepts of system design.
- Requirements and goals
- Storage and network capacity
- System APIs
- High level design
- Database Schema
- Data Sharding
- Caching
Requirements and goals
Functional goals
- Post tweet
- Favorite tweet
- Follow user
- Timeline
Non-functional goals
- Highly available
- Low latency - fast loading
- Consistency
Storage and network capacity
Let's say we have:
- 1 billion users
- 200M daily active users
- 100M tweets/day
Storage
We can allow users to tweet text with a maximum length of 200 characters.
Doing the math, assuming 2 bytes per character: 100M x (2 bytes x 200 + 40 bytes of metadata) = 100M x 440 bytes = 44 billion bytes (~44 GB) per day.
This means that in a year we would need ~(44 x 365) GB = ~16 TB of storage.
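The storage arithmetic can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the constant names are made up for illustration, and decimal units (1 GB = 10^9 bytes) are assumed.

```python
# Back-of-the-envelope storage estimate using the figures from the text.
TWEETS_PER_DAY = 100_000_000   # 100M tweets/day
MAX_CHARS = 200                # maximum tweet length
BYTES_PER_CHAR = 2             # assumed 2 bytes per character
META_BYTES = 40                # metadata per tweet

tweet_size = MAX_CHARS * BYTES_PER_CHAR + META_BYTES   # 440 bytes
daily_bytes = TWEETS_PER_DAY * tweet_size              # 44 billion bytes
yearly_bytes = daily_bytes * 365

GB = 10**9   # decimal gigabyte (assumption)
TB = 10**12  # decimal terabyte (assumption)

print(f"per tweet: {tweet_size} bytes")
print(f"per day:   {daily_bytes / GB:.0f} GB")
print(f"per year:  {yearly_bytes / TB:.1f} TB")
```

Changing any constant (for example, a longer maximum tweet length) immediately shows how the yearly figure scales.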
Network capacity
Let tweet_size = 440 bytes (2 bytes x 200 characters + 40 bytes of metadata).
If we get about 10 billion tweet views per day, we would need to transfer 10B x tweet_size / (24 x 60 x 60) s = ~51 MB/s, or roughly 400 Mbit/s, of network capacity.
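The same kind of quick check works for the bandwidth figure. The sketch below assumes decimal units and reports the result in both megabytes and megabits per second, since the two are easy to confuse; the variable names are illustrative.

```python
# Back-of-the-envelope bandwidth estimate for tweet reads.
TWEET_SIZE = 440                  # bytes per tweet
VIEWS_PER_DAY = 10_000_000_000    # 10 billion tweet views/day
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds

bytes_per_second = VIEWS_PER_DAY * TWEET_SIZE / SECONDS_PER_DAY
megabytes_per_second = bytes_per_second / 10**6
megabits_per_second = bytes_per_second * 8 / 10**6

print(f"{megabytes_per_second:.1f} MB/s")
print(f"{megabits_per_second:.0f} Mbit/s")
```

This is an average over the whole day; peak traffic would be higher, so real capacity planning would add headroom on top of this number.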
System APIs
We can use either SOAP or REST to expose our services. We are going to use REST because it is lightweight, flexible to implement, and easy to learn.
POST: tweet(key, tweet_data, location) returns the URL of the new tweet
GET: tweet(key, tweet_id) returns information about the tweet as JSON
The key identifies who is accessing our services.
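To make the two calls concrete, here is a minimal in-memory sketch of the logic behind them. It is not an HTTP server, and everything beyond the parameter names from the text is an assumption for illustration: the `TweetService` class, the `example.com` base URL, the JSON field names, and the way keys map to user IDs.

```python
import itertools
import json
from datetime import datetime, timezone


class TweetService:
    """In-memory stand-in for the two endpoints (hypothetical, not a real server)."""

    BASE_URL = "https://example.com/tweets"  # hypothetical base URL

    def __init__(self, api_keys):
        self._api_keys = api_keys        # key -> user id
        self._tweets = {}                # tweet id -> tweet record
        self._ids = itertools.count(1)   # simple incrementing tweet ids

    def _authenticate(self, key):
        # The key tells us who is calling; unknown keys are rejected.
        if key not in self._api_keys:
            raise PermissionError("unknown API key")
        return self._api_keys[key]

    def post_tweet(self, key, tweet_data, location=None):
        """POST tweet: store the tweet and return the URL of the new tweet."""
        user_id = self._authenticate(key)
        if len(tweet_data) > 200:
            raise ValueError("tweet exceeds 200 characters")
        tweet_id = next(self._ids)
        self._tweets[tweet_id] = {
            "id": tweet_id,
            "userId": user_id,
            "text": tweet_data,
            "location": location,
            "createdAt": datetime.now(timezone.utc).isoformat(),
        }
        return f"{self.BASE_URL}/{tweet_id}"

    def get_tweet(self, key, tweet_id):
        """GET tweet: return information about the tweet as a JSON string."""
        self._authenticate(key)
        return json.dumps(self._tweets[tweet_id])


service = TweetService({"secret-key": 42})
url = service.post_tweet("secret-key", "hello world", location=(40.7, -74.0))
tweet = json.loads(service.get_tweet("secret-key", 1))
print(url, tweet["text"])
```

In a real deployment these methods would sit behind HTTP routes, and the key check would typically also drive rate limiting per caller.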
High level design
A request flows through the system as follows: client > load balancer > server cluster > database/file storage
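The load balancer step can be illustrated with a small sketch. The text does not say which balancing strategy is used, so round-robin is assumed here purely as the simplest example; `RoundRobinBalancer` and `make_server` are hypothetical names, and the "servers" are plain functions standing in for app servers.

```python
import itertools


class RoundRobinBalancer:
    """Hands each incoming request to the next server in turn (assumed strategy)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def route(self, request):
        server = next(self._cycle)   # pick the next server in rotation
        return server(request)


def make_server(name):
    """Build a stand-in app server that just reports which server answered."""
    def handle(request):
        return f"{name} handled {request}"
    return handle


balancer = RoundRobinBalancer([make_server("app-1"), make_server("app-2")])
responses = [balancer.route(f"GET /tweets/{i}") for i in range(3)]
for line in responses:
    print(line)
```

Because no server holds per-request state in this sketch, any server can answer any request, which is what lets the cluster grow by simply adding more servers to the rotation.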
Database Schema
tweet: id, userId, lat, lon, createdAt, numOfFavorites
user: id, name, email, dob, lastLogin
userFollow: follower, following
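As a rough illustration, the three tables could be created in SQLite as shown below. The column names come from the schema above; the column types, keys, and constraints are assumptions, as is the timeline query at the end, which simply joins tweets with the follow relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE user (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    email     TEXT NOT NULL UNIQUE,
    dob       TEXT,
    lastLogin TEXT
);
CREATE TABLE tweet (
    id             INTEGER PRIMARY KEY,
    userId         INTEGER NOT NULL REFERENCES user(id),
    lat            REAL,
    lon            REAL,
    createdAt      TEXT NOT NULL,
    numOfFavorites INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE userFollow (
    follower  INTEGER NOT NULL REFERENCES user(id),
    following INTEGER NOT NULL REFERENCES user(id),
    PRIMARY KEY (follower, following)
);
""")

# Sample data: alice (1) follows bob (2), and bob has posted one tweet.
conn.execute("INSERT INTO user (id, name, email) VALUES (1, 'alice', 'a@example.com')")
conn.execute("INSERT INTO user (id, name, email) VALUES (2, 'bob', 'b@example.com')")
conn.execute("INSERT INTO userFollow (follower, following) VALUES (1, 2)")
conn.execute(
    "INSERT INTO tweet (id, userId, createdAt) VALUES (10, 2, '2024-01-01T00:00:00Z')"
)

# Timeline for alice: tweets by everyone she follows, newest first.
timeline = conn.execute(
    """
    SELECT t.id, t.userId
    FROM tweet AS t
    JOIN userFollow AS f ON f.following = t.userId
    WHERE f.follower = ?
    ORDER BY t.createdAt DESC
    """,
    (1,),
).fetchall()
print(timeline)
```

The composite primary key on userFollow prevents the same follow relationship from being stored twice.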
Data Sharding
We can split the data across different machines.
When an app server sends a request, a balancer or hash function determines which machine holds the requested data.
server > balancer / hash function > A-N storage or O-Z storage
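Both routing options can be sketched in a few lines. The first function mirrors the A-N / O-Z split from the diagram; the second shows hash-based routing. The function names and shard labels are hypothetical, and MD5 is used here only as a stable, well-distributed hash (not for security), since Python's built-in `hash()` of strings changes between runs.

```python
import hashlib


def range_shard(name):
    """Range-based routing: names starting with A-N go to one shard,
    everything else (O-Z, digits, empty names) falls through to the other."""
    first = name[:1].upper()
    return "storage-A-N" if "A" <= first <= "N" else "storage-O-Z"


def hash_shard(key, num_shards):
    """Hash-based routing: map any key to a shard index in [0, num_shards)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


print(range_shard("alice"))   # storage-A-N
print(range_shard("zoe"))     # storage-O-Z
print(hash_shard("alice", 4))
```

Range-based routing is easy to reason about but can become unbalanced if many names share a few starting letters; hashing spreads keys more evenly, at the cost of having to move data when the number of shards changes.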
Caching
We can use an LRU (least recently used) cache.
With this, we check whether the requested tweet is in the cache before querying the database.
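A minimal sketch of this lookup flow is shown below, using `collections.OrderedDict` to track recency. The `LRUCache` class and the `get_tweet` helper are illustrative names, and a plain dictionary stands in for the database.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()   # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None               # cache miss
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


def get_tweet(tweet_id, cache, database):
    """Check the cache first; on a miss, read the database and cache the result."""
    tweet = cache.get(tweet_id)
    if tweet is None:
        tweet = database[tweet_id]
        cache.put(tweet_id, tweet)
    return tweet


database = {1: "first tweet", 2: "second tweet", 3: "third tweet"}  # stand-in DB
cache = LRUCache(capacity=2)
for tweet_id in (1, 2, 1, 3):   # tweet 2 is least recently used when 3 arrives
    get_tweet(tweet_id, cache, database)
print(cache.get(2))   # None: evicted
print(cache.get(1))   # first tweet
```

Popular tweets stay in the cache because every read refreshes their position, while rarely viewed tweets are evicted first.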