System Design Deep Dive: Building Instagram
Prepare for the system design interview: Design Instagram
In a world driven by visuals, Instagram stands out. With over 2 billion monthly active users sharing billions of photos and videos daily, it's a platform that defines visual social networking.
But what complex engineering lies beneath the simple scroll of your feed or the tap to upload a story? How do you design a system that handles petabytes of media, generates personalized feeds for billions, and delivers it all with minimal latency?
In this deep dive, we'll explore the high-level system design of building a platform like Instagram, focusing on the core components and challenges crucial for system design interviews.
Inside This Issue: 📸
✅ Requirements: Defining the Instagram Challenge
📊 Scale: Estimating Capacity Needs
🏙️ Architecture: High-Level System Blueprint
💾 Databases: Storing Billions of Posts & Relations
📰 Deep Dive: Generating the Feed
🖼️ Deep Dive: Handling Media Uploads & Storage
💬 Deep Dive: Building Direct Messaging
🚦 Scaling & Reliability: Tackling Bottlenecks
🔌 APIs: Designing Service Interfaces
⭐ Interview Prep: Key Questions & Diagrams
1. Requirement Gathering
Let's start by outlining the core requirements for an Instagram-like service.
1.1 Functional Requirements
User Profiles: Users can create profiles, upload profile pictures, write bios, and manage settings.
Media Upload: Users can upload photos and videos (posts, stories, reels).
Feed: Users see a personalized feed of posts from people they follow. There are separate feeds for Stories and Reels.
Follow/Unfollow: Users can follow and unfollow other users.
Engagement: Users can like and comment on posts.
Direct Messaging: Support 1:1 and group chats (similar to the WhatsApp design).
Search/Discovery: Users can search for other users, hashtags, and locations. A discovery/explore feed suggests content.
Notifications: Notify users about likes, comments, new followers, DMs, etc.
1.2 Non-functional Requirements
High Availability: The service must stay available through failures of individual servers or data centers; user-visible downtime should be extremely rare.
Scalability: Must scale to billions of users and handle massive amounts of content uploads and reads.
Low Latency: Feed loading, image/video viewing, and interactions should feel near-instantaneous. Because the workload is read-heavy, read paths need the most optimization.
Durability: Uploaded media must never be lost.
Consistency: Eventual consistency is acceptable for many features (e.g., like counts, follower lists, feed updates), but strong consistency might be needed for user registration, profile updates, or financial transactions (ads).
2. Capacity Estimation (Illustrative)
Let's make some rough assumptions:
Total Users: 2 Billion
Daily Active Users (DAU): 1 Billion (50% of total)
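Media Uploads per DAU: Assume each active user uploads one piece of media (photo/video mix) every 2 days on average => 1 billion DAU * 0.5 = 500 million uploads/day.
Average Media Size: Take a weighted average of photos (~5 MB) and videos (~25 MB). With a mix of roughly 75% photos and 25% videos, for example, this gives ~10 MB per upload.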
Daily Storage Ingest: 500 million uploads/day * 10 MB/upload = 5 PB/day.
Note: This doesn't include replicas, thumbnails, or transcoded video versions, which could easily add up to 3-5x this figure.
Feed Reads per DAU: Assume a user refreshes their feed 20 times a day => 1 billion DAU * 20 = 20 billion feed loads/day. Averaged over a day, that is 20 billion / (24 * 3600 s) ≈ 230k QPS (Queries Per Second). With a peak factor of, say, 3x the average, the feed service should plan for roughly 700k QPS at peak.
Bandwidth (Egress - Reads): Dominated by media delivery. If each feed load involves viewing ~5 new media items, and the average size is ~2 MB (mix of thumbnails/previews/full loads): 20 billion feed loads * 5 items * 2 MB = 200 PB/day of egress traffic. Serving this requires massive CDN capacity.
Key Takeaway: Instagram is extremely read-heavy (feed scrolling, media viewing) and storage-intensive. Under these assumptions, daily egress (~200 PB) is about 40x daily ingest (~5 PB).
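To make the arithmetic easy to re-run with different assumptions, here is a minimal Python sketch of the estimates above. It is a back-of-envelope calculator, not a sizing tool. The function name estimate, its parameter names, and the 75% photo share are illustrative assumptions (the photo share is simply one mix that makes 5 MB photos and 25 MB videos average out to 10 MB). Sizes use decimal units (1 PB = 1e9 MB).

```python
SECONDS_PER_DAY = 24 * 3600
MB_PER_PB = 10**9  # decimal units: 1 PB = 1e9 MB


def estimate(photo_share=0.75, peak_factor=3, items_per_load=5, item_mb=2):
    """Return the illustrative daily capacity figures as a dict."""
    total_users = 2_000_000_000
    dau = total_users // 2               # 50% of total users are active daily
    uploads_per_day = dau // 2           # one upload per active user every 2 days

    # photo_share is a hypothetical mix (an assumption) chosen so that the
    # 5 MB photo / 25 MB video blend comes out to 10 MB per upload.
    avg_upload_mb = photo_share * 5 + (1 - photo_share) * 25

    ingest_pb = uploads_per_day * avg_upload_mb / MB_PER_PB
    feed_loads = dau * 20                # 20 feed refreshes per active user
    avg_qps = feed_loads / SECONDS_PER_DAY
    egress_pb = feed_loads * items_per_load * item_mb / MB_PER_PB

    return {
        "uploads_per_day": uploads_per_day,
        "avg_upload_mb": avg_upload_mb,
        "ingest_pb_per_day": ingest_pb,
        # Replicas, thumbnails, and transcodes: 3x to 5x the raw ingest.
        "ingest_pb_with_copies": (3 * ingest_pb, 5 * ingest_pb),
        "feed_loads_per_day": feed_loads,
        "avg_feed_qps": avg_qps,
        "peak_feed_qps": avg_qps * peak_factor,
        "egress_pb_per_day": egress_pb,
        "egress_to_ingest_ratio": egress_pb / ingest_pb,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value}")
```

Changing photo_share, peak_factor, or items_per_load shows how sensitive each total is to a single assumption, which is usually the point an interviewer wants to see you reason about.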