Rate Limiting & Algorithms



Quick Reference

| Algorithm | How it Works | Pros | Cons |
|---|---|---|---|
| Token Bucket | Tokens added at a fixed rate, consumed per request | Allows bursts | More complex to tune |
| Leaky Bucket | Requests leak out at a constant rate | Smooths traffic | No bursts |
| Sliding Window | Count requests in a rolling time window | Accurate | Memory intensive |
| Fixed Window | Count requests per fixed time period | Simple | Boundary issues |
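The fixed window's "boundary issue" is easiest to see in code. Below is a minimal sketch (class and method names are illustrative, not from any particular library) showing how a burst straddling a window boundary can admit up to twice the limit:

```python
class FixedWindowCounter:
    """Fixed-window counter: count requests per discrete time bucket.
    Simple, but a burst straddling a window boundary can admit up to
    2x the limit within a short span (the 'boundary issue')."""

    def __init__(self, limit: int, window: float):
        self.limit = limit          # max requests per window
        self.window = window        # window length in seconds
        self.current_window = None  # id of the window we are counting in
        self.count = 0

    def allow(self, now: float) -> bool:
        window_id = int(now // self.window)
        if window_id != self.current_window:
            # New window: reset the counter.
            self.current_window = window_id
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False

fw = FixedWindowCounter(limit=5, window=60)
# 5 requests at t=59s and 5 more at t=61s all pass: 10 requests in
# 2 seconds, even though the stated limit is 5 per minute.
print(all(fw.allow(59) for _ in range(5)) and all(fw.allow(61) for _ in range(5)))  # → True
```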

Clear Definition

Rate Limiting controls the rate of requests from clients to prevent abuse, ensure fair usage, and protect system resources.

šŸ’” Key Insight: Choose algorithm based on whether you need burst handling, accuracy, or simplicity.


Core Concepts

Token Bucket

  • Tokens added at fixed rate
  • Request consumes token
  • Allows bursts (if tokens available)
  • Example: API rate limiting
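The bullets above can be sketched as a small class (a minimal single-threaded illustration; the class and method names are ours, and a production limiter would add locking and persistence):

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: tokens refill at a fixed rate and
    each request consumes one. Bursts up to `capacity` are allowed
    whenever the bucket has accumulated enough tokens."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.tokens = float(capacity) # start full
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill for the elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)   # 5 req/s steady, bursts of 10
burst = [bucket.allow() for _ in range(12)] # back-to-back requests
# The first 10 drain the full bucket (burst allowed); later requests
# must wait for tokens to refill at 5 per second.
```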

Sliding Window

  • Track requests in time window
  • More accurate than fixed window
  • Memory intensive
  • Example: Distributed rate limiting
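A sliding-window *log* variant makes the trade-off above concrete: one timestamp is stored per accepted request, so the count is exact but memory grows with the limit. A minimal sketch (names are illustrative; a distributed version would keep the log in shared storage such as Redis):

```python
import time
from collections import deque

class SlidingWindowLog:
    """Sliding-window log limiter: store a timestamp per accepted
    request; the count over the window is exact, at the cost of
    O(limit) memory per client."""

    def __init__(self, limit: int, window: float):
        self.limit = limit    # max requests per window
        self.window = window  # window length in seconds
        self.log = deque()    # timestamps of accepted requests

    def allow(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        # Evict timestamps that have slid out of the window.
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False

limiter = SlidingWindowLog(limit=3, window=60)
print([limiter.allow(now=t) for t in (0, 1, 2, 3, 61)])
# → [True, True, True, False, True]  (by t=61 the old timestamps expired)
```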

Use Cases

  1. API Protection: Prevent abuse
  2. DDoS Mitigation: Limit attack traffic
  3. Fair Usage: Ensure equal access
  4. Cost Control: Limit expensive operations

Best Practices

  1. Choose Right Algorithm: Token bucket for bursts, sliding window for accuracy
  2. Distributed Systems: Use Redis for shared state
  3. Headers: Return rate limit headers (X-RateLimit-*)
  4. Graceful Handling: Return 429 Too Many Requests
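Practices 3 and 4 can be sketched framework-independently. The `X-RateLimit-*` header names below are the common de facto convention (not a formal standard), and the handler function is a hypothetical example, not a real library API:

```python
def rate_limit_headers(limit: int, remaining: int, reset_epoch: int) -> dict:
    """Build the conventional X-RateLimit-* response headers so clients
    can see their quota, how much is left, and when it resets."""
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(reset_epoch),  # Unix time of window reset
    }

def handle(allowed: bool, limit: int, remaining: int,
           reset_epoch: int, retry_after: int):
    """Hypothetical request handler: 200 when under the limit,
    429 Too Many Requests plus Retry-After when over it."""
    headers = rate_limit_headers(limit, remaining, reset_epoch)
    if allowed:
        return 200, headers
    # Graceful rejection: tell the client when to back off and retry.
    headers["Retry-After"] = str(retry_after)
    return 429, headers

status, headers = handle(False, limit=100, remaining=0,
                         reset_epoch=1700000060, retry_after=30)
print(status, headers["Retry-After"])  # → 429 30
```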

Quick Reference Summary

Rate Limiting: Control request rate to prevent abuse.

Algorithms: Token bucket (bursts), Sliding window (accurate), Fixed window (simple).

Key: Implement at the API gateway or reverse-proxy level so limits apply before requests reach backend services.

