In-Memory Databases
Quick Reference
| Database | Type | Use Case | Persistence | Performance |
|---|---|---|---|---|
| Redis | Key-Value | Caching, sessions, pub/sub | Optional (RDB, AOF) | ~100K ops/sec |
| Memcached | Key-Value | Simple caching | None (volatile) | ~200K ops/sec |
| Apache Ignite | In-Memory | Analytics, caching | Optional | High throughput |
| Hazelcast | In-Memory | Distributed caching | Optional | Low latency |
Clear Definition
In-memory databases store data primarily in RAM rather than on disk, providing extremely low latency (microseconds) and high throughput (hundreds of thousands of operations per second). They're used for caching, session storage, real-time analytics, and as a performance layer in front of traditional databases.
💡 Key Insight: In-memory databases trade durability (data can be lost on restart) for speed. Most provide optional persistence mechanisms.
Core Concepts
How In-Memory Databases Work
- RAM Storage: Data stored in system memory (RAM)
- Fast Access: No disk I/O, direct memory access
- Volatility: Data lost on power failure (unless persisted)
- High Throughput: Can handle millions of operations/second
Redis (Remote Dictionary Server)
Features
- Data Structures: Strings, lists, sets, sorted sets, hashes, streams
- Persistence Options: RDB snapshots, AOF (append-only file)
- Replication: Master-slave replication
- Pub/Sub: Publish-subscribe messaging
- Lua Scripting: Server-side scripting
- Transactions: Multi-command transactions
Data Structures
# Strings
SET user:1001 "John Doe"
GET user:1001
# Hashes
HSET user:1001 name "John" email "john@example.com"
HGETALL user:1001
# Lists
LPUSH queue:emails "email1@example.com"
RPOP queue:emails
# Sets
SADD tags:article:123 "redis" "database"
SMEMBERS tags:article:123
# Sorted Sets (for leaderboards)
ZADD leaderboard 100 "player1"
ZREVRANGE leaderboard 0 9
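The sorted-set commands above are what make Redis leaderboards cheap. As a sketch of the semantics (not a Redis client), the same behavior can be mimicked in plain Python; a real deployment would send ZADD/ZREVRANGE to a Redis server instead:

```python
# Minimal in-process sketch of Redis sorted-set leaderboard semantics.
# `scores` stands in for the sorted set; ZADD sets a member's score,
# ZREVRANGE returns members ranked highest-score-first.

scores = {}  # member -> score

def zadd(member, score):
    scores[member] = score

def zrevrange(start, stop):
    # Like ZREVRANGE leaderboard start stop (stop is inclusive)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[start:stop + 1]

zadd("player1", 100)
zadd("player2", 250)
zadd("player3", 180)
print(zrevrange(0, 1))  # top two players
```

In Redis the sorted set keeps members ordered on every write (a skip list internally), so the top-N query is fast without re-sorting on each read.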
Memcached
Features
- Simple: Key-value store only
- Volatile: No persistence, data lost on restart
- Distributed: Client-side sharding
- Fast: Extremely low overhead
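"Client-side sharding" means the Memcached client, not the server, decides which node holds each key. A minimal sketch of the idea (real clients such as libmemcached typically use consistent hashing so that adding a node remaps only a fraction of keys):

```python
import hashlib

# Sketch of client-side sharding: hash the key, pick a server by modulo.
# Server addresses here are illustrative.

servers = ["cache1:11211", "cache2:11211", "cache3:11211"]

def server_for(key):
    digest = hashlib.md5(key.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# The same key always routes to the same server, so reads find
# the value that an earlier write placed there.
assert server_for("user:1001") == server_for("user:1001")
```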
Use Cases
- Simple caching layer
- Session storage
- HTML fragment caching
Use Cases
1. Caching Layer
Problem: Database queries are slow (milliseconds)
Solution: Cache frequently accessed data in Redis/Memcached
Example:
# Check cache first
user = redis.get(f"user:{user_id}")
if not user:
    # Cache miss - query database
    user = db.query_user(user_id)
    redis.setex(f"user:{user_id}", 3600, user)  # Cache for 1 hour
2. Session Storage
Problem: Stateless applications need session data
Solution: Store sessions in Redis
Example: User login sessions, shopping cart data
3. Real-Time Leaderboards
Problem: Need to rank users quickly
Solution: Use Redis sorted sets
Example: Game scores, trending topics, top users
4. Rate Limiting
Problem: Prevent abuse and ensure fair usage
Solution: Track request counts in Redis
Example: API rate limiting, login attempt limiting
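The usual Redis pattern here is INCR on a per-client, per-window counter plus EXPIRE so stale counters clean themselves up. A fixed-window sketch with a dict standing in for Redis, so the logic is visible:

```python
import time

# Sketch of fixed-window rate limiting, mirroring the Redis pattern
# INCR rate:{client}:{window} followed by EXPIRE. A dict stands in for
# Redis; windows are one minute wide and the limit is illustrative.

counts = {}
LIMIT = 5  # max requests per client per window

def allow(client_id, now=None):
    now = now if now is not None else time.time()
    window = int(now // 60)       # one-minute fixed window
    key = (client_id, window)
    counts[key] = counts.get(key, 0) + 1
    return counts[key] <= LIMIT

# The sixth request inside the same window is rejected.
results = [allow("api-key-1", now=1000.0) for _ in range(6)]
```

Fixed windows allow bursts at window boundaries; sliding-window or token-bucket variants smooth that out at the cost of more state per client.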
5. Pub/Sub Messaging
Problem: Real-time notifications and updates
Solution: Redis pub/sub or streams
Example: Chat applications, real-time feeds
6. Distributed Locks
Problem: Prevent race conditions in distributed systems
Solution: Redis SET with the NX and EX options (atomic acquire-with-expiration; plain SETNX cannot set a TTL in the same step)
Example: Preventing duplicate processing, resource locking
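The core of the Redis locking pattern is SET lock <random-token> NX EX <ttl>: NX makes acquisition atomic, the TTL prevents a crashed holder from blocking everyone forever, and the random token ensures a client can only release a lock it still owns (in Redis that check-and-delete is done in a Lua script to stay atomic). An in-process sketch of the same logic:

```python
import time
import uuid

# Sketch of the SET ... NX EX locking pattern, simulated with a dict.
# In production use a Redis client (or a vetted recipe such as Redlock).

locks = {}  # name -> (owner_token, expires_at)

def acquire(name, ttl=10):
    now = time.time()
    holder = locks.get(name)
    if holder and holder[1] > now:
        return None                       # lock held and not expired
    token = str(uuid.uuid4())
    locks[name] = (token, now + ttl)
    return token

def release(name, token):
    holder = locks.get(name)
    if holder and holder[0] == token:     # only the owner may release
        del locks[name]
        return True
    return False

t = acquire("job:42")
assert acquire("job:42") is None  # second acquire fails while held
release("job:42", t)
```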
Advantages & Disadvantages
Advantages
✅ Extreme Performance: Microsecond latency, millions of ops/sec
✅ Low Latency: No disk I/O; direct memory access
✅ High Throughput: Can handle massive request volumes
✅ Flexible Data Structures: Rich data types (Redis)
✅ Scalability: Can cluster and shard horizontally
Disadvantages
❌ Volatility: Data lost on restart (unless persisted)
❌ Cost: RAM is expensive compared to disk
❌ Memory Limits: Constrained by available RAM
❌ Persistence Overhead: Persistence can impact performance
❌ Complexity: Requires careful memory management
Best Practices
1. Cache Strategy
Cache-Aside (Lazy Loading)
def get_user(user_id):
    # Check cache
    user = cache.get(f"user:{user_id}")
    if user:
        return user
    # Cache miss - load from DB
    user = db.get_user(user_id)
    cache.set(f"user:{user_id}", user, ttl=3600)
    return user
Write-Through
def update_user(user_id, data):
    # Update DB
    db.update_user(user_id, data)
    # Update cache
    cache.set(f"user:{user_id}", data, ttl=3600)
Write-Behind (Write-Back)
def update_user(user_id, data):
    # Update cache immediately
    cache.set(f"user:{user_id}", data, ttl=3600)
    # Queue for DB write (async)
    queue.enqueue(db.update_user, user_id, data)
2. TTL (Time-To-Live) Management
- Set appropriate expiration times
- Use different TTLs for different data types
- Implement cache warming for critical data
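One refinement worth knowing: if many keys are cached with the exact same TTL, they all expire at once and the database absorbs a burst of misses (a cache stampede). Adding random jitter to the TTL spreads expirations out; a small sketch (the base TTL and spread are illustrative):

```python
import random

# Sketch of TTL jitter: vary each key's expiration by a random offset
# so that keys cached at the same time do not all expire together.

BASE_TTL = 3600  # one hour

def jittered_ttl(base=BASE_TTL, spread=0.1):
    # Returns a TTL within +/-10% of the base value.
    return int(base * random.uniform(1 - spread, 1 + spread))

# e.g. cache.set(key, value, ttl=jittered_ttl())
```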
3. Memory Management
- Monitor memory usage
- Set maxmemory policy (eviction policies)
- Use compression for large values
- Implement data expiration
4. Persistence Configuration
Redis Persistence Options:
- RDB: Point-in-time snapshots (faster, less frequent)
- AOF: Append-only file (more durable, slower)
- Both: Maximum durability (recommended for production)
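In Redis these options map to a few redis.conf directives; a representative fragment (values are illustrative, close to the historical defaults):

```conf
# redis.conf - persistence settings (illustrative values)
save 900 1            # RDB snapshot if >=1 key changed in 900s
save 300 10           # ...or >=10 keys changed in 300s
appendonly yes        # enable the AOF
appendfsync everysec  # fsync the AOF once per second (durability/speed balance)
```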
5. High Availability
- Use Redis Sentinel for failover
- Implement Redis Cluster for sharding
- Set up replication (master-slave)
- Monitor replication lag
Common Pitfalls
⚠️ Common Mistake: Using in-memory DB as primary database without persistence.
Solution: Always have a persistent database backing in-memory cache.
⚠️ Common Mistake: Not setting TTLs, leading to memory exhaustion.
Solution: Always set appropriate expiration times.
⚠️ Common Mistake: Caching everything, including rarely accessed data.
Solution: Cache only frequently accessed, expensive-to-compute data.
⚠️ Common Mistake: Not handling cache invalidation properly.
Solution: Implement cache invalidation strategies (TTL, explicit invalidation, versioning).
⚠️ Common Mistake: Ignoring memory limits and eviction policies.
Solution: Configure maxmemory and eviction policy (LRU, LFU, etc.).
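To make the LRU policy concrete, here is a minimal sketch of the eviction behavior behind a policy like Redis's allkeys-lru (Redis actually uses a sampled, approximate LRU for efficiency): when the cache is full, the least recently used entry goes first.

```python
from collections import OrderedDict

# Sketch of exact LRU eviction using an ordered dict: most recently
# used entries live at the end, eviction removes from the front.

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now the most recently used
cache.put("c", 3)      # capacity exceeded: evicts "b"
print(cache.get("b"))  # None
```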
Interview Tips
🎯 Interview Focus: Interviewers often ask about caching strategies and performance optimization:
- Cache Patterns: Know cache-aside, write-through, write-behind
- Eviction Policies: Understand LRU, LFU, FIFO
- Consistency: Discuss cache invalidation strategies
- Performance: Know latency and throughput characteristics
- Trade-offs: Understand memory vs disk trade-offs
Common Questions
- "How would you implement a caching layer using Redis?"
- "What's the difference between Redis and Memcached?"
- "How do you handle cache invalidation?"
- "Explain cache eviction policies."
- "How would you design a distributed cache?"
Related Topics
- Step 4: Caching: Detailed caching strategies
- SQL vs NoSQL: Database selection
- Data Replication: Redis replication strategies
Visual Aids
Cache Architecture
┌──────────┐
│  Client  │
└────┬─────┘
     │
     ▼
┌─────────────┐    Cache Miss    ┌──────────┐
│    Redis    │ ───────────────▶ │ Database │
│   (Cache)   │ ◀─────────────── │          │
└──────┬──────┘    Cache Write   └──────────┘
       │
       │ Cache Hit
       ▼
┌──────────┐
│ Response │
└──────────┘
Redis Persistence Options
        ┌─────────────┐
        │    Redis    │
        │  (Memory)   │
        └──────┬──────┘
               │
        ┌──────┴───────┐
        │              │
  ┌─────▼─────┐  ┌─────▼───────┐
  │    RDB    │  │     AOF     │
  │ Snapshot  │  │   Append    │
  │ (Periodic)│  │(Every write)│
  └───────────┘  └─────────────┘
Quick Reference Summary
In-Memory Databases: Store data in RAM for microsecond latency and high throughput. Use for caching, sessions, real-time data, and performance optimization.
Redis: Feature-rich in-memory database with persistence options, data structures, and pub/sub. Best for complex caching needs.
Memcached: Simple, fast key-value store without persistence. Best for straightforward caching.
Key Consideration: Always have a persistent database backing your in-memory cache. Use appropriate TTLs and eviction policies.