YouTube System Design



Quick Reference

Scale: 2B+ users, 500+ hours uploaded/minute, 1B+ hours watched/day, 80M+ videos

Key Components: Video storage, CDN, transcoding pipeline, recommendations, search, comments, analytics

Challenges: Massive storage, global delivery, real-time transcoding, recommendations at scale, search indexing


Clear Definition

YouTube is the world's largest video sharing platform, handling unprecedented scale: 2B+ users, 500+ hours of video uploaded every minute, and 1B+ hours watched daily. It requires massive distributed storage, global content delivery, real-time video processing, sophisticated recommendation systems, and powerful search capabilities.

💡 Key Insight: YouTube leverages Google's infrastructure (Cloud Storage, CDN, BigQuery) for storage and analytics, uses a distributed transcoding pipeline for video processing, and employs advanced ML algorithms for recommendations and search ranking.


System Requirements

Functional Requirements

  1. Video Upload

    • Accept video uploads (up to 256 GB or 12 hours, whichever is less)
    • Support multiple formats
    • Progress tracking
    • Resume interrupted uploads
  2. Video Processing

    • Transcode to multiple formats/qualities
    • Generate thumbnails
    • Extract metadata
    • Content moderation
  3. Video Playback

    • Stream videos in multiple qualities
    • Adaptive bitrate streaming
    • Support live streaming
    • Offline downloads (Premium)
  4. User Features

    • User accounts and channels
    • Subscriptions
    • Likes, comments, shares
    • Playlists and watch history
  5. Discovery

    • Personalized recommendations
    • Search functionality
    • Trending videos
    • Related videos

Non-Functional Requirements

  1. Scale

    • 2B+ users globally
    • 500+ hours uploaded/minute
    • 1B+ hours watched/day
    • Handle viral videos (millions of concurrent viewers)
  2. Performance

    • Fast video start (< 2 seconds)
    • Smooth playback
    • Fast search results
    • Real-time recommendations
  3. Availability

    • 99.9% uptime
    • Global availability
    • Handle traffic spikes
  4. Storage

    • Petabytes of video storage
    • Efficient storage (compression)
    • Geographic distribution
    • Redundancy

High-Level Architecture

┌──────────────────────────────────────────────────────────────┐
│                    Client Applications                       │
│  (Web, Mobile, TV, Gaming Consoles)                          │
└───────────────────────┬──────────────────────────────────────┘
                        │
                        │ HTTPS
                        ▼
┌──────────────────────────────────────────────────────────────┐
│                    Load Balancer / API Gateway               │
└───────────────────────┬──────────────────────────────────────┘
                        │
       ┌────────────────┼────────────────┐
       │                │                │
       ▼                ▼                ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│   Upload     │ │   Playback   │ │   Search     │
│   Service    │ │   Service    │ │   Service    │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
       │                │                │
       │                │                │
       ▼                ▼                ▼
┌──────────────────────────────────────────────────────────────┐
│              Transcoding Pipeline                            │
│  (Video Processing, Thumbnail Generation)                    │
└───────────────────────┬──────────────────────────────────────┘
                        │
                        ▼
┌──────────────────────────────────────────────────────────────┐
│              Google Cloud Storage                            │
│         (Encoded Videos, Thumbnails, Metadata)               │
└───────────────────────┬──────────────────────────────────────┘
                        │
                        ▼
┌──────────────────────────────────────────────────────────────┐
│                    Google CDN                                │
│         (Global Content Delivery)                            │
└──────────────────────────────────────────────────────────────┘

Core Components

1. Video Upload Service

Responsibilities:

  • Accept video uploads
  • Validate video files
  • Track upload progress
  • Handle resume for interrupted uploads

Upload Flow:

1. User selects video file
   │
2. Client requests upload URL
   │
3. Server generates signed URL (Google Cloud Storage)
   │
4. Client uploads directly to Cloud Storage
   │
5. Server notified when upload complete
   │
6. Video queued for processing
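
Steps 2 to 4 of the flow above rely on the server handing out a short-lived, signed URL so the client can write to storage without passing the file through application servers. The sketch below illustrates the general idea with a plain HMAC signature. It is not the Google Cloud Storage signing scheme; the host name, secret key, and function names are hypothetical.

```python
import hashlib
import hmac
from urllib.parse import urlencode

# Hypothetical values for illustration only.
SECRET_KEY = b"demo-secret"
UPLOAD_HOST = "https://uploads.example.com"


def _signature(object_name: str, expires_at: int, key: bytes) -> str:
    """HMAC over the method, object name, and expiry timestamp."""
    message = f"PUT\n{object_name}\n{expires_at}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def sign_upload_url(object_name: str, expires_at: int,
                    key: bytes = SECRET_KEY) -> str:
    """Server side: build a URL that authorizes one upload until expires_at."""
    query = urlencode({
        "expires": expires_at,
        "signature": _signature(object_name, expires_at, key),
    })
    return f"{UPLOAD_HOST}/{object_name}?{query}"


def verify_upload_url(object_name: str, expires_at: int, signature: str,
                      now: int, key: bytes = SECRET_KEY) -> bool:
    """Storage side: accept only unexpired URLs with a matching signature."""
    expected = _signature(object_name, expires_at, key)
    return now <= expires_at and hmac.compare_digest(expected, signature)
```

Because the signature covers the object name and the expiry, a client cannot reuse the URL for a different object or after the deadline.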

Technologies:

  • Direct Upload: Upload directly to Cloud Storage (bypasses servers)
  • Resumable Uploads: Support resuming interrupted uploads
  • Progress Tracking: Real-time upload progress

Optimizations:

  • Chunked uploads (resume capability)
  • Compression before upload (reduce bandwidth)
  • Parallel uploads (faster for large files)
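
Chunked uploads make resuming cheap: the client only needs to know how many bytes the server has already confirmed. A minimal sketch of that bookkeeping follows; the 8 MiB chunk size and the function name are assumptions, not values from any real upload API.

```python
from typing import Iterator, Tuple

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB, an assumed chunk size


def remaining_chunks(file_size: int, bytes_confirmed: int,
                     chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, int]]:
    """Yield (start, end_exclusive) byte ranges that still need to be sent.

    bytes_confirmed is the offset the server reports as safely stored,
    so a resumed upload skips everything before it.
    """
    start = bytes_confirmed
    while start < file_size:
        end = min(start + chunk_size, file_size)
        yield (start, end)
        start = end
```

After an interruption, the client would ask the server for the confirmed offset and call `remaining_chunks` again with that value. Independent ranges can also be sent in parallel.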

2. Transcoding Pipeline

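What is Transcoding? Re-encoding an uploaded video into other codecs, resolutions, and bitrates (e.g., one H.264 MP4 upload into 360p, 720p, and 1080p renditions in H.264 and VP9), then packaging the results for streaming protocols such as HLS or DASH.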

Transcoding Pipeline:

Raw Video Upload
         │
         ▼
┌─────────────────┐
│  Video          │
│  Validation     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Transcoding    │
│  Queue          │
└────────┬────────┘
         │
         ▼
┌─────────────────────────────────┐
│  Parallel Transcoding Jobs      │
│  - 480p, 720p, 1080p, 4K        │
│  - Multiple bitrates            │
│  - H.264, VP9, AV1 codecs       │
│  - Generate thumbnails          │
└────────┬────────────────────────┘
         │
         ▼
┌─────────────────┐
│  Store Encoded  │
│  Videos         │
└─────────────────┘

Transcoding Formats:

  • Resolutions: 240p, 360p, 480p, 720p, 1080p, 1440p, 2160p (4K)
  • Bitrates: Multiple bitrates per resolution
  • Codecs: H.264 (compatibility), VP9 (efficiency), AV1 (future)
  • Containers: MP4, WebM
  • Streaming: HLS (HTTP Live Streaming), DASH
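
The format list above implies a fan-out: one upload becomes many independent jobs, one per resolution and codec. A rough sketch of that planning step is shown here. The `TranscodeJob` type and `plan_jobs` function are hypothetical, and skipping resolutions above the source height is an assumption rather than something the list states.

```python
from dataclasses import dataclass
from itertools import product
from typing import List

# Resolutions (frame height) and codecs taken from the list above.
RESOLUTIONS = [240, 360, 480, 720, 1080, 1440, 2160]
CODECS = ["h264", "vp9", "av1"]


@dataclass(frozen=True)
class TranscodeJob:
    video_id: str
    height: int
    codec: str


def plan_jobs(video_id: str, source_height: int) -> List[TranscodeJob]:
    """Create one job per (resolution, codec) pair.

    Assumption: renditions taller than the source are skipped,
    since upscaling adds cost without adding detail.
    """
    heights = [h for h in RESOLUTIONS if h <= source_height]
    return [TranscodeJob(video_id, h, c) for h, c in product(heights, CODECS)]
```

Each job is independent, which is what lets the pipeline run them in parallel on separate workers.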

Processing Time:

  • Depends on video length and resolution
  • Typically 1-2x video length
  • Parallel processing for multiple formats
  • Priority queue for popular videos
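
A priority queue for transcoding can be sketched with Python's `heapq`. How priority is assigned (for example, by channel size) is outside this sketch; the `TranscodeQueue` class and its integer priorities are illustrative assumptions.

```python
import heapq
import itertools
from typing import List, Tuple


class TranscodeQueue:
    """Pops the highest-priority video first; ties are served in arrival order."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._arrival = itertools.count()  # tie-breaker that preserves FIFO order

    def push(self, video_id: str, priority: int) -> None:
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), video_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

The arrival counter matters: without it, equal-priority videos would be ordered by video ID rather than by upload time.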

Technologies:

  • Distributed Processing: Multiple transcoding servers
  • Queue System: Google Cloud Tasks or Pub/Sub
  • Storage: Google Cloud Storage
  • Monitoring: Track processing time, failures

3. Video Storage

Storage Architecture:

  • Primary Storage: Google Cloud Storage
  • Replication: Multiple copies for redundancy
  • Geographic Distribution: Store in multiple regions
  • Lifecycle Management: Move old videos to cheaper storage

Storage Optimization:

  • Compression: Efficient codecs (VP9, AV1)
  • Tiered Storage: Hot (frequently accessed) vs cold (rarely accessed)
  • Deduplication: Avoid storing duplicate content
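
The simplest form of deduplication keys each blob by a hash of its bytes, so identical uploads share one stored copy. The in-memory `DedupStore` below is a hypothetical sketch of that idea. It only catches byte-identical files; detecting re-encoded copies of the same video would need perceptual fingerprinting, which is not shown.

```python
import hashlib
from typing import Dict


class DedupStore:
    """Content-addressed store: identical bytes are kept once."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}            # digest -> data
        self._video_to_digest: Dict[str, str] = {}    # video_id -> digest

    def put(self, video_id: str, data: bytes) -> bool:
        """Store data for video_id; return True if new bytes were written."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self._blobs
        if is_new:
            self._blobs[digest] = data
        self._video_to_digest[video_id] = digest
        return is_new

    def get(self, video_id: str) -> bytes:
        return self._blobs[self._video_to_digest[video_id]]

    def stored_blobs(self) -> int:
        return len(self._blobs)
```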

Storage Scale:

  • Petabytes of video data
  • Millions of video files
  • Continuous growth (500+ hours/minute)
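
To get a feel for the growth rate, the upload figure can be turned into a back-of-the-envelope estimate. Only the 500 hours/minute figure comes from this document; the size per source hour and the multiplier for extra renditions are assumptions chosen to make the arithmetic concrete.

```python
HOURS_UPLOADED_PER_MINUTE = 500      # figure quoted in this document
ASSUMED_GB_PER_SOURCE_HOUR = 1.0     # assumption, varies widely with resolution
ASSUMED_RENDITION_MULTIPLIER = 3.0   # assumption: all renditions vs. one source


def daily_upload_hours(hours_per_minute: float) -> float:
    """Hours of video uploaded per day."""
    return hours_per_minute * 60 * 24


def daily_storage_tb(hours_per_minute: float, gb_per_hour: float,
                     rendition_multiplier: float) -> float:
    """Approximate new storage per day in terabytes (1 TB = 1000 GB)."""
    hours = daily_upload_hours(hours_per_minute)
    return hours * gb_per_hour * rendition_multiplier / 1000
```

With these inputs, 500 hours/minute is 720,000 hours of new video per day, and the assumed sizes give roughly 2,160 TB (about 2 PB) of new storage per day before replication.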

4. CDN (Content Delivery Network)

CDN Architecture:

  • Google's Global CDN: Leverages Google's infrastructure
  • Edge Locations: Servers in major cities worldwide
  • Caching: Cache popular videos at edge
  • Routing: Route users to nearest edge location

CDN Strategy:

  • Popular Videos: Pre-cached at edge locations
  • Less Popular: Served from origin, cached on demand
  • Live Streaming: Special handling for live content
  • Geographic Routing: Route based on user location
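
"Cached on demand" can be pictured as a small least-recently-used (LRU) cache in front of the origin. The `EdgeCache` class below is a simplified sketch, not how any particular CDN is implemented. Real edges also consider object size, TTLs, and popularity signals.

```python
from collections import OrderedDict
from typing import Callable


class EdgeCache:
    """LRU cache that falls back to the origin on a miss."""

    def __init__(self, capacity: int,
                 fetch_from_origin: Callable[[str], bytes]) -> None:
        self._capacity = capacity
        self._fetch = fetch_from_origin
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self._items:
            self._items.move_to_end(key)   # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        data = self._fetch(key)            # cache miss: go to the origin
        self._items[key] = data
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used
        return data
```

Pre-caching popular videos amounts to calling `get` for those keys before users request them, so the first viewer already hits the edge.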

Benefits:

  • Low Latency: Content served from nearby location
  • High Bandwidth: Handle millions of concurrent streams
  • Cost Effective: Reduce origin server load
  • Scalability: Handle viral videos

5. Playback Service

Responsibilities:

  • Generate playback URLs
  • Manage playback sessions
  • Track watch time
  • Handle quality switching

Playback Flow:

1. User clicks video
   │
2. Client requests playback URL
   │
3. Playback service generates signed URL (CDN)
   │
4. Client starts streaming from CDN
   │
5. Adaptive bitrate: Adjust quality based on bandwidth
   │
6. Track watch time and update history

Adaptive Streaming:

  • Start with lower quality (faster start)
  • Upgrade to higher quality if bandwidth allows
  • Downgrade if bandwidth decreases
  • Smooth transitions between qualities
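
The upgrade and downgrade rules above can be sketched as a throughput-based selection: pick the highest rendition whose bitrate fits within a safety margin of the measured bandwidth. The bitrate ladder and the 0.8 safety factor are illustrative assumptions, and production players typically also take buffer level into account.

```python
from typing import List, Tuple

# (frame height, assumed bitrate in kbps), sorted from lowest to highest.
LADDER_KBPS: List[Tuple[int, int]] = [
    (240, 400), (360, 750), (480, 1200), (720, 2500),
    (1080, 5000), (1440, 9000), (2160, 18000),
]


def pick_rendition(throughput_kbps: float, safety_factor: float = 0.8) -> int:
    """Return the tallest rendition that fits within the bandwidth budget.

    Falls back to the lowest rendition when nothing fits, so playback
    can always start.
    """
    budget = throughput_kbps * safety_factor
    best = LADDER_KBPS[0][0]
    for height, kbps in LADDER_KBPS:
        if kbps <= budget:
            best = height
    return best
```

Calling this after each downloaded segment with a fresh throughput estimate yields the behavior described above: quality rises when bandwidth allows and drops when it shrinks.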

6. Recommendation System

How it Works:

  1. Data Collection: Track user behavior (watches, likes, comments, search)
  2. Feature Extraction: Extract features from videos and users
  3. Model Training: Train ML models (deep learning, collaborative filtering)
  4. Real-time Inference: Generate recommendations in real-time
  5. A/B Testing: Continuously test and improve

Recommendation Algorithms:

  • Collaborative Filtering: "Users who watched X also watched Y"
  • Content-Based: Recommend similar content
  • Deep Learning: Neural networks for complex patterns
  • Hybrid: Combine multiple approaches
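
The "users who watched X also watched Y" idea can be shown with simple co-watch counting. This is a toy sketch of item-to-item collaborative filtering under the assumption that watch histories fit in memory. It is not YouTube's actual model, and the function names are made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, List, Set


def build_co_watch(histories: Dict[str, Set[str]]) -> Dict[str, Counter]:
    """Count, for every pair of videos, how many users watched both."""
    co: Dict[str, Counter] = defaultdict(Counter)
    for videos in histories.values():
        for a, b in combinations(sorted(videos), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def also_watched(co: Dict[str, Counter], video_id: str, k: int = 3) -> List[str]:
    """Return up to k videos most often co-watched with video_id."""
    return [video for video, _ in co[video_id].most_common(k)]
```

Raw counts favor globally popular videos, which is one reason the approaches listed above are usually combined.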

Key Features:

  • Personalization: Recommendations per user
  • Real-time Updates: Update based on recent activity
  • Diversity: Show variety (not just similar videos)
  • Exploration: Balance popular vs niche content

Recommendation Sources:

  • Home page recommendations
  • "Up next" suggestions
  • Related videos
  • Trending videos
  • Subscribed channels

7. Search Service

Search Architecture:

  • Indexing: Index video metadata, transcripts, captions
  • Ranking: Rank results by relevance, popularity, recency
  • Autocomplete: Fast search suggestions
  • Filters: Filter by duration, upload date, etc.

Search Features:

  • Full-Text Search: Search in titles, descriptions, transcripts
  • Video Search: Search within video content (speech-to-text)
  • Filters: Duration, upload date, view count, etc.
  • Sorting: Relevance, upload date, view count, rating

Technologies:

  • Indexing: Google's search infrastructure
  • Ranking: ML-based ranking algorithms
  • Autocomplete: Trie data structure for fast suggestions
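
A trie stores queries character by character, so all completions of a prefix sit under one node. The sketch below ranks completions by how often each query was seen. The `Autocomplete` class and its count-based ranking are assumptions for illustration. A production system would likely precompute the top suggestions per node instead of walking the subtree on every request.

```python
from typing import Dict, List, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.count = 0  # how often the query ending at this node was seen


class Autocomplete:
    def __init__(self) -> None:
        self._root = TrieNode()

    def add(self, query: str, count: int = 1) -> None:
        node = self._root
        for ch in query.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.count += count

    def suggest(self, prefix: str, k: int = 5) -> List[str]:
        """Return up to k stored queries starting with prefix, most frequent first."""
        prefix = prefix.lower()
        node = self._root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        # Walk the subtree under the prefix and collect complete queries.
        matches: List[Tuple[int, str]] = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.count:
                matches.append((current.count, text))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        matches.sort(key=lambda m: (-m[0], m[1]))
        return [text for _, text in matches[:k]]
```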

8. Comments and Social Features

Comments System:

  • Comments Storage: Store comments with video association
  • Threading: Support comment replies (nested)
  • Moderation: Content moderation (automated + manual)
  • Real-time Updates: Real-time comment updates

Social Features:

  • Subscriptions: Follow channels
  • Likes/Dislikes: Video ratings
  • Shares: Share videos
  • Playlists: Create and manage playlists

Data Flow

Video Upload Flow

1. User uploads video
   │
2. Video uploaded to Cloud Storage (direct upload)
   │
3. Upload service notified
   │
4. Video queued for transcoding
   │
5. Transcoding pipeline processes video
   │
6. Multiple formats generated
   │
7. Encoded videos stored in Cloud Storage
   │
8. Metadata stored in database
   │
9. Video available for playback

Video Playback Flow

1. User clicks video
   │
2. Client requests playback URL
   │
3. Playback service generates CDN URL
   │
4. Client starts streaming from CDN
   │
5. Adaptive bitrate adjusts quality
   │
6. Watch time tracked
   │
7. Recommendations updated

Recommendation Flow

1. User watches video
   │
2. Behavior tracked (watch time, completion, interaction)
   │
3. Data sent to analytics
   │
4. Recommendation service processes data
   │
5. ML models generate recommendations
   │
6. Recommendations cached
   │
7. The next time the user opens YouTube, they see personalized recommendations

Scaling Strategies

1. Horizontal Scaling

Microservices:

  • Each service scales independently
  • Auto-scaling based on load
  • Stateless services

Transcoding:

  • Distributed transcoding servers
  • Queue-based processing
  • Parallel processing

Storage:

  • Distributed object storage
  • Sharded databases
  • Read replicas

2. Caching

Video Caching:

  • Cache popular videos on CDN
  • Pre-populate edge locations
  • Cache metadata and thumbnails

Application Caching:

  • Cache recommendations
  • Cache search results
  • Cache user data

3. Database Sharding

User Data:

  • Shard by user ID
  • Distribute across databases
  • Handle cross-shard queries
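
Sharding by user ID usually means hashing the ID to a shard number. Cross-shard queries then become a scatter-gather over the shards involved. The sketch below shows both steps. The shard count of 16 and the function names are assumptions, and plain modulo hashing is used for brevity even though it forces large data moves when the shard count changes (consistent hashing reduces that).

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List

NUM_SHARDS = 16  # assumed shard count


def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard using a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and would route users inconsistently.
    """
    digest = hashlib.sha256(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def group_by_shard(user_ids: Iterable[str],
                   num_shards: int = NUM_SHARDS) -> Dict[int, List[str]]:
    """Group user IDs by shard so a cross-shard query hits each shard once."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for user_id in user_ids:
        groups[shard_for_user(user_id, num_shards)].append(user_id)
    return dict(groups)
```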

Video Data:

  • Shard by video ID or channel
  • Use read replicas for read scalability

4. Geographic Distribution

Data Centers:

  • Multiple regions globally
  • Route users to nearest region
  • Replicate critical data

CDN:

  • Edge locations worldwide
  • Route to nearest edge
  • Handle regional outages

Key Design Decisions

1. Direct Upload to Cloud Storage

Decision: Upload directly to Cloud Storage, bypassing servers

Rationale:

  • Reduce server load
  • Faster uploads
  • Better scalability
  • Cost effective

Trade-offs:

  • ✅ Better scalability
  • ✅ Faster uploads
  • ✅ Lower server costs
  • ❌ Less control over upload process
  • ❌ Need signed URLs

2. Distributed Transcoding

Decision: Use distributed transcoding pipeline

Rationale:

  • Handle massive upload volume
  • Parallel processing
  • Scalable architecture
  • Fault tolerance

Trade-offs:

  • ✅ High throughput
  • ✅ Scalable
  • ✅ Fault tolerant
  • ❌ More complex
  • ❌ Queue management needed

3. Multiple Encoding Formats

Decision: Encode in multiple formats/resolutions/codecs

Rationale:

  • Support different devices
  • Handle varying bandwidth
  • Future-proof (new codecs)
  • Optimize storage and delivery

Trade-offs:

  • ✅ Better device support
  • ✅ Bandwidth optimization
  • ✅ Future compatibility
  • ❌ Higher encoding costs
  • ❌ More storage required

4. Google Infrastructure

Decision: Leverage Google's infrastructure (Cloud Storage, CDN, BigQuery)

Rationale:

  • Proven at scale
  • Global infrastructure
  • Cost effective
  • Integrated services

Trade-offs:

  • ✅ Proven scalability
  • ✅ Global reach
  • ✅ Cost effective
  • ❌ Vendor lock-in
  • ❌ Less control

Challenges and Solutions

Challenge 1: Massive Storage

Problem: Store petabytes of video data

Solution:

  • Distributed object storage (Cloud Storage)
  • Efficient compression (VP9, AV1)
  • Tiered storage (hot vs cold)
  • Lifecycle management

Challenge 2: Real-time Transcoding

Problem: Process 500+ hours of video uploaded every minute

Solution:

  • Distributed transcoding pipeline
  • Queue-based processing
  • Parallel processing
  • Priority queue for popular videos

Challenge 3: Global Delivery

Problem: Deliver videos to 2B+ users globally with low latency

Solution:

  • Google's global CDN
  • Edge caching
  • Geographic routing
  • Pre-population of popular content

Challenge 4: Recommendations at Scale

Problem: Provide personalized recommendations to 2B+ users

Solution:

  • ML-based recommendation system
  • Distributed model serving
  • Real-time behavior tracking
  • Efficient caching

Challenge 5: Search at Scale

Problem: Search through millions of videos quickly

Solution:

  • Distributed search index
  • ML-based ranking
  • Caching of popular searches
  • Efficient indexing

Monitoring and Observability

Key Metrics

Performance Metrics:

  • Video start time
  • Buffering rate
  • Upload success rate
  • Transcoding time
  • Search latency

Business Metrics:

  • Daily active users
  • Watch time
  • Upload volume
  • Engagement (likes, comments, shares)

Infrastructure Metrics:

  • CDN hit rate
  • Storage usage
  • Transcoding queue depth
  • Server utilization

Alerting

  • Alert on high error rates
  • Alert on CDN issues
  • Alert on transcoding delays
  • Alert on high latency

Best Practices

1. Transcoding Optimization

  • Use efficient codecs (VP9, AV1)
  • Parallel processing
  • Priority queue for popular videos
  • Monitor processing time

2. CDN Strategy

  • Pre-populate popular content
  • Cache at edge locations
  • Monitor CDN performance
  • Optimize cache hit rates

3. Storage Optimization

  • Use efficient compression
  • Tiered storage (hot vs cold)
  • Lifecycle management
  • Deduplication

4. Recommendation System

  • Continuously improve models
  • A/B test new algorithms
  • Monitor recommendation quality
  • Balance exploration vs exploitation
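
One common way to balance exploration and exploitation is an epsilon-greedy rule: most slots go to the top-ranked videos, while a small fraction are filled from a pool of less-proven candidates. The sketch below is a generic illustration of that rule, not a description of YouTube's system. The function name and parameters are hypothetical.

```python
import random
from typing import List


def build_slate(ranked: List[str], explore_pool: List[str], slots: int,
                epsilon: float, rng: random.Random) -> List[str]:
    """Fill `slots` positions, exploring with probability epsilon per slot.

    ranked: videos ordered best-first by the model (exploitation).
    explore_pool: candidates the model has little data on (exploration).
    """
    ranked_left = list(ranked)
    explore_left = [v for v in explore_pool if v not in ranked]
    slate: List[str] = []
    while len(slate) < slots and (ranked_left or explore_left):
        explore = bool(explore_left) and (
            not ranked_left or rng.random() < epsilon
        )
        if explore:
            # Pick a random exploration candidate and remove it from the pool.
            slate.append(explore_left.pop(rng.randrange(len(explore_left))))
        else:
            slate.append(ranked_left.pop(0))
    return slate
```

An epsilon of 0 reproduces the model's ranking exactly. Raising it trades some short-term relevance for data about videos the model has not yet learned to rank.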

Quick Reference Summary

YouTube: World's largest video sharing platform with 2B+ users.

Key Components:

  • Video upload and storage (Cloud Storage)
  • Distributed transcoding pipeline
  • Global CDN (Google CDN)
  • ML-based recommendation system
  • Powerful search functionality

Key Design Decisions:

  • Direct upload to Cloud Storage
  • Distributed transcoding pipeline
  • Multiple encoding formats
  • Leverage Google's infrastructure

Scaling Strategies:

  • Horizontal scaling of microservices
  • Global CDN distribution
  • Database sharding
  • Geographic distribution

Remember: YouTube's success comes from handling massive scale (500+ hours uploaded/minute) through distributed systems, efficient video processing, and sophisticated recommendation algorithms.

