
System Design


Design Twitter

“Logic will get you from A to Z; imagination will get you everywhere.” ― Albert Einstein

We are going to sketch a simple Twitter design by addressing the following:

  1. Requirements and goals
  2. Storage and network capacity
  3. System APIs
  4. High level design
  5. Database Schema
  6. Data Sharding
  7. Caching

Requirements and goals

Functional goals

  • Post tweet
  • Favorite tweet
  • Follow user
  • Timeline

Non-Functional Goals

  • Highly available
  • Low latency - fast loading
  • Consistency

Storage and network capacity

Let's say we have:

  • 1 billion users
  • 200M Daily active users
  • 100M tweets/day

Storage

We can allow users to tweet text of at most 200 characters.

Doing the math: 100M x (200 chars x 2 bytes per char + 40 bytes of metadata) = 100M x 440 bytes = 44 billion bytes (~44 GB) per day.

This means that in a year we would need about 44 x 365 GB = ~16 TB of data storage.
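
The storage arithmetic above can be checked with a few lines of Python. This is only a sketch: the constant and function names are made up for illustration, and the figures are the ones assumed in this article.

```python
# Back-of-the-envelope storage estimate using the figures from the article.
TWEETS_PER_DAY = 100_000_000   # 100M tweets/day
MAX_CHARS = 200                # maximum tweet length
BYTES_PER_CHAR = 2             # assumed 2 bytes per character
META_BYTES = 40                # assumed metadata overhead per tweet


def tweet_size_bytes() -> int:
    """Worst-case size of one stored tweet."""
    return MAX_CHARS * BYTES_PER_CHAR + META_BYTES


def daily_storage_bytes() -> int:
    """Bytes written per day if every tweet is at maximum length."""
    return TWEETS_PER_DAY * tweet_size_bytes()


def yearly_storage_bytes() -> int:
    """Bytes written per 365-day year."""
    return daily_storage_bytes() * 365


if __name__ == "__main__":
    print(f"{tweet_size_bytes()} bytes per tweet")
    print(f"{daily_storage_bytes() / 1e9:.0f} GB per day")
    print(f"{yearly_storage_bytes() / 1e12:.1f} TB per year")
```

Note that this is an upper bound, since most tweets will be shorter than 200 characters.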

Network capacity

Let tweet_size = 440 bytes (the per-tweet size from the storage estimate).

If we get about 10 billion tweet views per day, we would need to transfer 10B x tweet_size / (24 x 60 x 60 s) = ~51 MB/s, which is roughly 400 Mbit/s of outbound network capacity.
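
The same estimate as a small Python sketch. The names are illustrative, and the 10 billion views/day figure is the assumption stated above, not a measured number.

```python
# Rough outbound bandwidth estimate for serving tweet views.
TWEET_SIZE_BYTES = 440             # from the storage estimate
VIEWS_PER_DAY = 10_000_000_000     # assumed 10 billion views/day
SECONDS_PER_DAY = 24 * 60 * 60


def egress_bytes_per_second(views_per_day: int = VIEWS_PER_DAY,
                            tweet_size: int = TWEET_SIZE_BYTES) -> float:
    """Average bytes per second needed to serve the given daily views."""
    return views_per_day * tweet_size / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = egress_bytes_per_second()
    print(f"{rate / 1e6:.1f} MB/s")         # megabytes per second
    print(f"{rate * 8 / 1e6:.0f} Mbit/s")   # megabits per second
```

This is an average over the whole day. Peak traffic would be higher, so real capacity planning would add headroom on top of this number.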

System APIs

We can use either SOAP or REST to expose our services. We are going to use REST because it is lightweight, flexible to implement, and easy to learn.

POST: tweet(key, tweet_data, location) returns the URL of the new tweet

GET: tweet(key, tweet_id) returns information about the tweet as JSON

The key identifies who is accessing our services.
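
To make the two calls concrete, here is a minimal in-memory sketch of what they might do behind the REST layer. Everything here is an assumption for illustration: the function names post_tweet and get_tweet, the VALID_KEYS table, the example.com base URL, and the field names in the stored record. A real service would sit behind a web framework and a database.

```python
import itertools
import json
from typing import Optional, Tuple

BASE_URL = "https://example.com/tweets"   # placeholder URL, not a real endpoint
VALID_KEYS = {"demo-key": "user-1"}       # hypothetical API key -> user id
MAX_CHARS = 200

_tweets = {}                              # in-memory stand-in for the database
_next_id = itertools.count(1)


def _authenticate(key: str) -> str:
    """Map an API key to a user id, rejecting unknown keys."""
    if key not in VALID_KEYS:
        raise PermissionError("unknown API key")
    return VALID_KEYS[key]


def post_tweet(key: str, tweet_data: str,
               location: Optional[Tuple[float, float]] = None) -> str:
    """POST: store a tweet and return the URL of the new tweet."""
    user_id = _authenticate(key)
    if len(tweet_data) > MAX_CHARS:
        raise ValueError("tweet is longer than 200 characters")
    tweet_id = next(_next_id)
    _tweets[tweet_id] = {
        "id": tweet_id,
        "userId": user_id,
        "text": tweet_data,
        "location": location,
    }
    return f"{BASE_URL}/{tweet_id}"


def get_tweet(key: str, tweet_id: int) -> str:
    """GET: return info about the tweet as a JSON string."""
    _authenticate(key)
    if tweet_id not in _tweets:
        raise KeyError(f"no tweet with id {tweet_id}")
    return json.dumps(_tweets[tweet_id])


if __name__ == "__main__":
    url = post_tweet("demo-key", "Hello, world!", (-1.29, 36.82))
    print(url)
    print(get_tweet("demo-key", int(url.rsplit("/", 1)[1])))
```

Checking the key first on every call is what lets the service attribute each request to a user and reject anonymous traffic.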

High level design

Requests can flow as follows: client > load balancer > server cluster > database/file storage
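
As one illustration of the load balancer step, the sketch below hands requests to servers in round-robin order. Round-robin is an assumption here, since the design does not name a balancing strategy, and the class and server names are hypothetical.

```python
import itertools
from typing import Iterable


class RoundRobinBalancer:
    """Hands out servers from a fixed pool in rotating order."""

    def __init__(self, servers: Iterable[str]):
        pool = list(servers)
        if not pool:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(pool)

    def pick(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._cycle)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    for _ in range(4):
        print(balancer.pick())
```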

Database Schema

tweet: id, userId, lat, lon, createdAt, numOfFavorites

user: id, name, email, dob, lastLogin

userFollow: follower, following
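
The schema above could be written out as tables roughly like this. It is a sketch using Python's built-in sqlite3 module with an in-memory database, not a recommendation of SQLite for this workload. The column types, the constraints, and the content column (added so the tweet text has somewhere to live) are assumptions, and timeline is a hypothetical helper showing how userFollow supports the Timeline feature.

```python
import sqlite3

SCHEMA = """
CREATE TABLE user (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    email     TEXT NOT NULL UNIQUE,
    dob       TEXT,
    lastLogin TEXT
);
CREATE TABLE tweet (
    id             INTEGER PRIMARY KEY,
    userId         INTEGER NOT NULL REFERENCES user(id),
    content        TEXT NOT NULL CHECK (length(content) <= 200),  -- assumed column
    lat            REAL,
    lon            REAL,
    createdAt      TEXT NOT NULL,
    numOfFavorites INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE userFollow (
    follower  INTEGER NOT NULL REFERENCES user(id),
    following INTEGER NOT NULL REFERENCES user(id),
    PRIMARY KEY (follower, following)
);
"""


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def timeline(conn: sqlite3.Connection, user_id: int, limit: int = 20) -> list:
    """Newest tweets from the users that user_id follows."""
    return conn.execute(
        """
        SELECT t.id, t.userId, t.content, t.createdAt
        FROM tweet AS t
        JOIN userFollow AS f ON f.following = t.userId
        WHERE f.follower = ?
        ORDER BY t.createdAt DESC
        LIMIT ?
        """,
        (user_id, limit),
    ).fetchall()
```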

Data Sharding

We can split the data across different machines.

When an app server sends a request, we can use a balancer or a hash function to determine which machine holds the requested data:

server > balancer / hash function > A-N storage or O-Z storage
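
Both routing options can be sketched in a few lines. The function names are hypothetical. The range version mirrors the A-N / O-Z split above, and the hash version uses MD5 only as a stable hash (Python's built-in hash() for strings changes between runs), not for security.

```python
import hashlib


def range_shard(username: str) -> str:
    """Route by first letter: A-N to one store, everything else to the other.

    In this sketch, names that do not start with a letter fall through
    to the O-Z store.
    """
    first = username[:1].upper()
    return "A-N" if "A" <= first <= "N" else "O-Z"


def hash_shard(key: str, num_shards: int = 2) -> int:
    """Route by a stable hash of the key, returning a shard index."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    for name in ["alice", "nancy", "oscar", "zoe"]:
        print(name, range_shard(name), hash_shard(name))
```

Range-based splits are easy to reason about but can become uneven if many names start with the same letters. Hashing usually spreads keys more evenly, at the cost of moving data around when the number of shards changes.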

Caching

We can use an LRU (least recently used) cache: when the cache is full, the entry that has gone unused the longest is evicted first.

With this, we check whether the requested tweet is in the cache before we query the database.
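
A minimal sketch of this idea, built on collections.OrderedDict from the standard library. The LRUCache class, the fetch_tweet helper, and the load_from_db callback are hypothetical names for illustration. A production system would more likely use a shared cache service than an in-process dictionary.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        """Return the cached value, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)   # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        """Insert or update a value, evicting the oldest entry if full."""
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # drop least recently used


def fetch_tweet(tweet_id: int, cache: LRUCache,
                load_from_db: Callable[[int], Any]) -> Any:
    """Check the cache first and fall back to the database on a miss."""
    tweet = cache.get(tweet_id)
    if tweet is None:
        tweet = load_from_db(tweet_id)
        cache.put(tweet_id, tweet)
    return tweet
```

Because recently viewed tweets tend to be viewed again soon, even a cache much smaller than the database can absorb a large share of reads.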

James Chege

James Chege is a software developer in Nairobi