In-Memory Databases



Quick Reference

Database        Type        Use Case                    Persistence          Performance
Redis           Key-Value   Caching, sessions, pub/sub  Optional (RDB, AOF)  ~100K ops/sec
Memcached       Key-Value   Simple caching              None (volatile)      ~200K ops/sec
Apache Ignite   In-Memory   Analytics, caching          Optional             High throughput
Hazelcast       In-Memory   Distributed caching         Optional             Low latency

Clear Definition

In-memory databases store data primarily in RAM rather than on disk, which gives very low latency (microseconds for the memory access itself, typically under a millisecond including a local network round trip) and high throughput (hundreds of thousands of operations per second). They are used for caching, session storage, real-time analytics, and as a performance layer in front of traditional databases.

💡 Key Insight: In-memory databases trade durability (data can be lost on restart) for speed. Most provide optional persistence mechanisms.


Core Concepts

How In-Memory Databases Work

  1. RAM Storage: Data stored in system memory (RAM)
  2. Fast Access: No disk I/O, direct memory access
  3. Volatility: Data lost on power failure (unless persisted)
  4. High Throughput: A single node typically handles hundreds of thousands of operations/second; pipelining or clustering can push this into the millions

Redis (Remote Dictionary Server)

Features

  • Data Structures: Strings, lists, sets, sorted sets, hashes, streams
  • Persistence Options: RDB snapshots, AOF (append-only file)
  • Replication: Primary-replica replication (called master-slave in older documentation)
  • Pub/Sub: Publish-subscribe messaging
  • Lua Scripting: Server-side scripting
  • Transactions: Multi-command transactions (MULTI/EXEC)

Data Structures

# Strings
SET user:1001 "John Doe"
GET user:1001

# Hashes
HSET user:1001 name "John" email "john@example.com"
HGETALL user:1001

# Lists
LPUSH queue:emails "email1@example.com"
RPOP queue:emails

# Sets
SADD tags:article:123 "redis" "database"
SMEMBERS tags:article:123

# Sorted Sets (for leaderboards)
ZADD leaderboard 100 "player1"
ZREVRANGE leaderboard 0 9

Memcached

Features

  • Simple: Key-value store only
  • Volatile: No persistence, data lost on restart
  • Distributed: Client-side sharding
  • Fast: Extremely low overhead
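
Client-side sharding means the client, not the server, decides which Memcached node owns a key. A minimal sketch of the idea, assuming a hypothetical list of node addresses and simple modulo hashing (real client libraries commonly use consistent hashing instead, so that adding a node remaps fewer keys):

```python
import hashlib

# Hypothetical node addresses, for illustration only.
SERVERS = ["cache-a:11211", "cache-b:11211", "cache-c:11211"]


def pick_server(key, servers=SERVERS):
    """Map a key to one server using a stable hash of the key."""
    # hashlib gives the same digest in every process, unlike built-in hash().
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    for key in ("user:1001", "user:1002", "session:abc"):
        print(key, "->", pick_server(key))
```

Every client that uses the same server list and hash function sends a given key to the same node, which is why the servers do not need to coordinate with one another.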

Use Cases

  • Simple caching layer
  • Session storage
  • HTML fragment caching

Use Cases

1. Caching Layer

Problem: Database queries are slow (milliseconds)

Solution: Cache frequently accessed data in Redis/Memcached

Example:

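import json
import redis

r = redis.Redis(decode_responses=True)  # assumes a local Redis server
cached = r.get(f"user:{user_id}")  # check cache first
if cached is not None:
    user = json.loads(cached)
else:
    # Cache miss: query the database (db is the application's own data layer)
    user = db.query_user(user_id)
    # Cache for 1 hour; assumes user is a JSON-serializable dict
    r.setex(f"user:{user_id}", 3600, json.dumps(user))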

2. Session Storage

Problem: Stateless applications need session data

Solution: Store sessions in Redis

Example: User login sessions, shopping cart data

3. Real-Time Leaderboards

Problem: Need to rank users quickly

Solution: Use Redis sorted sets

Example: Game scores, trending topics, top users
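
A leaderboard on sorted sets can be sketched as follows. The function bodies use redis-py's zincrby and zrevrange calls; the FakeSortedSets class is an assumption added only so the example runs without a server, and it covers just the two calls used here.

```python
class FakeSortedSets:
    """In-memory stand-in (an assumption for this sketch) that mimics the
    two redis-py calls used below, so the example runs without a server."""

    def __init__(self):
        self._sets = {}

    def zincrby(self, name, amount, value):
        scores = self._sets.setdefault(name, {})
        scores[value] = scores.get(value, 0.0) + amount
        return scores[value]

    def zrevrange(self, name, start, end, withscores=False):
        scores = self._sets.get(name, {})
        # Highest score first; ties broken by member name, descending.
        ranked = sorted(scores.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
        stop = None if end == -1 else end + 1  # other negative indexes not modeled
        page = ranked[start:stop]
        return page if withscores else [member for member, _ in page]


def add_points(client, player, points):
    """Add points to a player's running total (ZINCRBY)."""
    return client.zincrby("leaderboard", points, player)


def top_players(client, n=10):
    """Return the top n (player, score) pairs, best first (ZREVRANGE)."""
    return client.zrevrange("leaderboard", 0, n - 1, withscores=True)


# With a real server, redis.Redis(decode_responses=True) would replace the fake.
client = FakeSortedSets()
add_points(client, "player1", 100)
add_points(client, "player2", 250)
add_points(client, "player1", 50)
print(top_players(client, 2))  # [('player2', 250.0), ('player1', 150.0)]
```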

4. Rate Limiting

Problem: Prevent abuse and ensure fair usage

Solution: Track request counts in Redis

Example: API rate limiting, login attempt limiting
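
One common way to track request counts is a fixed-window counter: increment a per-user key and give it an expiry when the window starts. The sketch below uses the redis-py call names incr and expire; FakeCounterStore is an assumption that stands in for the server so the example is runnable, and the key name and limits are illustrative.

```python
import time


class FakeCounterStore:
    """In-memory stand-in (an assumption for this sketch) for the two
    redis-py calls used below: incr() and expire()."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._values = {}    # key -> int
        self._deadline = {}  # key -> expiry time

    def _purge(self, key):
        # Drop the key if its expiry time has passed.
        if key in self._deadline and self._clock() >= self._deadline[key]:
            self._values.pop(key, None)
            self._deadline.pop(key, None)

    def incr(self, key):
        self._purge(key)
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]

    def expire(self, key, seconds):
        if key in self._values:
            self._deadline[key] = self._clock() + seconds
            return True
        return False


def allow_request(client, user_id, limit=5, window_seconds=60):
    """Fixed-window limiter: at most `limit` requests per window per user."""
    key = f"ratelimit:{user_id}"
    count = client.incr(key)  # INCR creates the key at 1 if it is missing
    if count == 1:
        client.expire(key, window_seconds)  # first request starts the window
    return count <= limit


store = FakeCounterStore()
results = [allow_request(store, "alice", limit=3) for _ in range(5)]
print(results)  # [True, True, True, False, False]
```

Two caveats with this simple form: a fixed window allows short bursts around window boundaries, and because INCR and EXPIRE are separate calls, a client that fails between them leaves a counter with no expiry. Running both steps in a Lua script or a transaction is a common way to close that gap.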

5. Pub/Sub Messaging

Problem: Real-time notifications and updates

Solution: Redis pub/sub or streams

Example: Chat applications, real-time feeds

6. Distributed Locks

Problem: Prevent race conditions in distributed systems

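Solution: Redis SET with the NX and PX (or EX) options, which sets the key and its expiration atomically; a separate SETNX followed by EXPIRE can leave a lock with no expiry if the client fails in between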

Example: Preventing duplicate processing, resource locking
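
A lock of this kind usually stores a unique token with an expiry and lets only the token's owner release it. The sketch below models those semantics in process memory, so it runs without a server; InMemoryLockStore and its method names are assumptions, with comments noting the Redis operation each one approximates.

```python
import time
import uuid


class InMemoryLockStore:
    """Models two Redis operations in process memory (an assumption made so
    the sketch runs without a server):
      set_if_absent    ~ SET key token NX PX ttl
      delete_if_equals ~ a Lua script that deletes the key only if its value matches
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (token, expiry time)

    def set_if_absent(self, key, token, ttl_seconds):
        now = self._clock()
        current = self._data.get(key)
        if current is not None and current[1] > now:
            return False  # another holder has an unexpired lock
        self._data[key] = (token, now + ttl_seconds)
        return True

    def delete_if_equals(self, key, token):
        current = self._data.get(key)
        if current is not None and current[1] > self._clock() and current[0] == token:
            del self._data[key]
            return True
        return False


def acquire(store, resource, ttl_seconds=10.0):
    """Try to take the lock; return an owner token on success, else None."""
    token = uuid.uuid4().hex  # unique per holder, so only the owner can release
    if store.set_if_absent(f"lock:{resource}", token, ttl_seconds):
        return token
    return None


def release(store, resource, token):
    """Release only if we still own the lock (it may have expired and been retaken)."""
    return store.delete_if_equals(f"lock:{resource}", token)


store = InMemoryLockStore()
first = acquire(store, "invoice:42")
second = acquire(store, "invoice:42")  # fails while the first lock is held
print(first is not None, second is None)            # True True
print(release(store, "invoice:42", "wrong-token"))  # False
print(release(store, "invoice:42", first))          # True
```

The expiry keeps a crashed holder from blocking everyone forever, and the token check keeps a slow holder from deleting a lock that has since passed to someone else.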


Advantages & Disadvantages

Advantages

✅ Extreme Performance: Microsecond-scale in-memory access, hundreds of thousands of ops/sec per node
✅ Low Latency: No disk I/O, direct memory access
✅ High Throughput: Can handle massive request volumes
✅ Flexible Data Structures: Rich data types (Redis)
✅ Scalability: Can cluster and shard horizontally

Disadvantages

❌ Volatility: Data lost on restart (unless persisted)
❌ Cost: RAM is expensive compared to disk
❌ Memory Limits: Constrained by available RAM
❌ Persistence Overhead: Persistence can impact performance
❌ Complexity: Requires careful memory management


Best Practices

1. Cache Strategy

Cache-Aside (Lazy Loading)

def get_user(user_id):
    # Check cache (cache and db are placeholders for the application's clients)
    user = cache.get(f"user:{user_id}")
    if user is not None:
        return user

    # Cache miss - load from DB
    user = db.get_user(user_id)
    cache.set(f"user:{user_id}", user, ttl=3600)
    return user

Write-Through

def update_user(user_id, data):
    # Update DB
    db.update_user(user_id, data)
    # Update cache
    cache.set(f"user:{user_id}", data, ttl=3600)

Write-Behind (Write-Back)

def update_user(user_id, data):
    # Update cache immediately
    cache.set(f"user:{user_id}", data, ttl=3600)
    # Queue for DB write (async)
    queue.enqueue(db.update_user, user_id, data)

2. TTL (Time-To-Live) Management

  • Set appropriate expiration times
  • Use different TTLs for different data types
  • Implement cache warming for critical data

3. Memory Management

  • Monitor memory usage
  • Set maxmemory policy (eviction policies)
  • Use compression for large values
  • Implement data expiration
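
To make the idea of an eviction policy concrete, here is a minimal sketch of least-recently-used (LRU) eviction. It is an exact LRU bounded by entry count; Redis itself bounds memory in bytes (maxmemory) and approximates LRU by sampling, so this illustrates the policy rather than how Redis implements it.

```python
from collections import OrderedDict


class LRUCache:
    """Exact LRU cache with a fixed number of entries (illustrative only)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # "a" is now more recent than "b"
cache.set("c", 3)  # over capacity: evicts "b"
print(cache.get("b"), cache.get("a"), cache.get("c"))  # None 1 3
```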

4. Persistence Configuration

Redis Persistence Options:

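  • RDB: Point-in-time snapshots (compact and fast to restore, but writes made after the last snapshot can be lost)
  • AOF: Append-only file that logs every write (more durable, with larger files and slower restarts)
  • Both: Maximum durability (commonly recommended in production when data loss matters)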
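
Enabling both mechanisms might look like the redis.conf fragment below. The directives are standard Redis configuration options; the specific values are illustrative assumptions, not tuning advice.

```
# Illustrative redis.conf fragment; the values are assumptions, not recommendations.

# RDB: write a snapshot if at least 1 key changed within 900 seconds.
save 900 1

# AOF: log every write command to the append-only file.
appendonly yes

# Flush the AOF to disk once per second (alternatives: always, no).
appendfsync everysec
```

With appendfsync everysec, a crash can lose roughly the last second of writes; always is safer but slower.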

5. High Availability

  • Use Redis Sentinel for failover
  • Implement Redis Cluster for sharding
  • Set up replication (primary-replica)
  • Monitor replication lag

Common Pitfalls

⚠️ Common Mistake: Using an in-memory DB as the primary database without persistence.

Solution: Always have a persistent database backing the in-memory cache.

⚠️ Common Mistake: Not setting TTLs, leading to memory exhaustion.

Solution: Always set appropriate expiration times.

⚠️ Common Mistake: Caching everything, including rarely accessed data.

Solution: Cache only frequently accessed, expensive-to-compute data.

⚠️ Common Mistake: Not handling cache invalidation properly.

Solution: Implement cache invalidation strategies (TTL, explicit invalidation, versioning).
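
Explicit invalidation can be as simple as deleting the cached entry whenever the underlying record changes. A minimal sketch, in which plain dicts stand in for the database and the cache (both are assumptions made to keep the example self-contained):

```python
# Plain dicts stand in for the database and the cache (assumptions for this sketch).
database = {"user:1": {"name": "John"}}
cache = {}


def get_user(key):
    """Cache-aside read: serve from the cache, fall back to the database."""
    if key in cache:
        return cache[key]
    value = database[key]
    cache[key] = value
    return value


def update_user(key, value):
    """Write to the database first, then drop the stale cache entry."""
    database[key] = value
    cache.pop(key, None)  # explicit invalidation; the next read repopulates


get_user("user:1")                       # populates the cache
update_user("user:1", {"name": "Jane"})  # invalidates it
print(get_user("user:1"))                # {'name': 'Jane'}
```

Deleting rather than overwriting the entry means the next reader reloads the value from the database.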

⚠️ Common Mistake: Ignoring memory limits and eviction policies.

Solution: Configure maxmemory and eviction policy (LRU, LFU, etc.).


Interview Tips

🎯 Interview Focus: Interviewers often ask about caching strategies and performance optimization:

  1. Cache Patterns: Know cache-aside, write-through, write-behind
  2. Eviction Policies: Understand LRU, LFU, FIFO
  3. Consistency: Discuss cache invalidation strategies
  4. Performance: Know latency and throughput characteristics
  5. Trade-offs: Understand memory vs disk trade-offs

Common Questions

  • "How would you implement a caching layer using Redis?"
  • "What's the difference between Redis and Memcached?"
  • "How do you handle cache invalidation?"
  • "Explain cache eviction policies."
  • "How would you design a distributed cache?"


Visual Aids

Cache Architecture

┌──────────┐
│  Client  │
└────┬─────┘
     │
     ▼
┌─────────────┐      Cache Miss       ┌──────────┐
│    Redis    │ ────────────────────▶ │ Database │
│   (Cache)   │ ◀──────────────────── │          │
└────┬────────┘      Cache Write      └──────────┘
     │
     │ Cache Hit
     ▼
┌──────────┐
│ Response │
└──────────┘

Redis Persistence Options

        ┌─────────────┐
        │    Redis    │
        │  (Memory)   │
        └──────┬──────┘
               │
       ┌───────┴────────┐
       │                │
┌──────▼─────┐  ┌───────▼───────┐
│    RDB     │  │      AOF      │
│  Snapshot  │  │    Append     │
│ (Periodic) │  │ (Every write) │
└────────────┘  └───────────────┘

Quick Reference Summary

In-Memory Databases: Store data in RAM for microsecond latency and high throughput. Use for caching, sessions, real-time data, and performance optimization.

Redis: Feature-rich in-memory database with persistence options, data structures, and pub/sub. Best for complex caching needs.

Memcached: Simple, fast key-value store without persistence. Best for straightforward caching.

Key Consideration: Always have a persistent database backing your in-memory cache. Use appropriate TTLs and eviction policies.

