CAP Theorem


Quick Reference

System Type   C     A     P     Example
CA            Yes   Yes   No    Single-node databases
CP            Yes   No    Yes   MongoDB, HBase
AP            No    Yes   Yes   Cassandra, DynamoDB
CAP           Yes   Yes   Yes   Impossible

Note: A distributed system can guarantee only 2 of the 3 properties at the same time. The choice becomes visible when a network partition occurs.


Clear Definition

The CAP Theorem states that in a distributed system, you can only guarantee two out of three properties simultaneously:

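  • Consistency: All nodes see the same data at the same time, so every read returns the most recent write
  • Availability: Every request to a working node receives a response, even if that response is not the latest data
  • Partition tolerance: The system keeps operating when network failures drop or delay messages between nodes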

💡 Key Insight: In a distributed system, network partitions are inevitable. When a partition occurs, you must choose between Consistency (CP) and Availability (AP).
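
The choice described above can be illustrated with a toy model. This is a minimal sketch, not how any real database works: the `Replica` class, the `"CP"`/`"AP"` mode labels, and the `partition` helper are all hypothetical names invented for the example. Two replicas replicate synchronously while healthy; once partitioned, a CP replica refuses requests and an AP replica keeps answering with whatever it has.

```python
class Unavailable(Exception):
    """Raised when a CP replica refuses a request during a partition."""


class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (hypothetical labels)
        self.value = None
        self.peer = None
        self.partitioned = False  # True when the peer is unreachable

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Cannot replicate, so refuse rather than let replicas diverge.
                raise Unavailable("cannot reach peer")
            self.value = value    # AP: accept locally; the peer is now stale
            return
        self.value = value
        self.peer.value = value   # synchronous replication while healthy

    def read(self):
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot confirm latest value")
        return self.value


def make_pair(mode):
    """Create two replicas of the same mode that replicate to each other."""
    a, b = Replica(mode), Replica(mode)
    a.peer, b.peer = b, a
    return a, b


def partition(a, b):
    """Simulate a network partition between the two replicas."""
    a.partitioned = b.partitioned = True


if __name__ == "__main__":
    a, b = make_pair("AP")
    a.write("v1")
    partition(a, b)
    a.write("v2")
    print(b.read())  # prints "v1": still available, but stale
```

With `make_pair("CP")` the same sequence raises `Unavailable` on the second write: the data stays consistent, but the request fails.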


Core Concepts

The Three Properties

Consistency (C)

  • All nodes see the same data simultaneously
  • Strong consistency guarantee
  • No stale reads

Availability (A)

  • System remains operational
  • Every request to a working node gets a response
  • No downtime from the client's point of view, even during a partition

Partition Tolerance (P)

  • System continues despite network failures
  • Some nodes cannot communicate with each other, but the system keeps working
  • Inevitable in distributed systems

CAP Trade-offs

CP Systems (Consistency + Partition Tolerance)

Characteristics:

  • Prioritize consistency over availability
  • During partition: Reject requests or return errors
  • Examples: MongoDB, HBase, traditional databases

Use Cases: When data accuracy is critical (financial systems)
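
One common way to get the "reject requests" behavior is a majority quorum: a write is applied only if more than half of the replicas can be reached. The sketch below assumes in-memory dictionaries as replicas and a hypothetical `quorum_write` helper; it illustrates the technique in general and is not a description of how MongoDB or HBase implement it.

```python
def quorum_write(replicas, reachable, key, value):
    """Apply a write only if a strict majority of replicas is reachable.

    replicas:  list of dicts, one per node (hypothetical in-memory stores)
    reachable: set of replica indexes the coordinator can currently contact
    Returns True if the write was applied, False if it was rejected.
    """
    majority = len(replicas) // 2 + 1
    if len(reachable) < majority:
        # CP choice: reject the write instead of letting replicas diverge.
        return False
    for i in reachable:
        replicas[i][key] = value
    return True


if __name__ == "__main__":
    nodes = [{}, {}, {}]
    print(quorum_write(nodes, {0, 1}, "balance", 100))  # prints True (2 of 3)
    print(quorum_write(nodes, {2}, "balance", 50))      # prints False (1 of 3)
```

The minority side of a partition can never reach a majority, so it rejects writes and cannot diverge from the majority side.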

AP Systems (Availability + Partition Tolerance)

Characteristics:

  • Prioritize availability over consistency
  • During partition: Continue serving, may return stale data
  • Examples: Cassandra, DynamoDB, CouchDB

Use Cases: When availability is critical (social media, DNS)
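
Because an AP system keeps accepting writes on both sides of a partition, the replicas must be reconciled once the partition heals. A simple strategy is last-write-wins: for each key, keep the value with the newer timestamp. The sketch below uses a hypothetical `lww_merge` function and assumes every write carries a distinct timestamp; it silently discards the older of two conflicting writes, which is the price of this approach.

```python
def lww_merge(a, b):
    """Merge two replica states using last-write-wins.

    Each state maps key -> (timestamp, value). For every key, the entry
    with the larger timestamp is kept. Timestamps are assumed distinct;
    on a tie, the entry from `a` is kept.
    """
    merged = dict(a)
    for key, (ts, value) in b.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged


if __name__ == "__main__":
    side_a = {"name": (1, "old"), "city": (5, "Oslo")}
    side_b = {"name": (2, "new"), "lang": (3, "en")}
    print(lww_merge(side_a, side_b))
    # prints {'name': (2, 'new'), 'city': (5, 'Oslo'), 'lang': (3, 'en')}
```

After both sides apply the merge they hold the same state again, which is what "eventual consistency" promises.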

CA Systems (Consistency + Availability)

Characteristics:

  • Single-node systems (no partition tolerance)
  • Not truly distributed
  • Examples: Single MySQL instance

Note: Not applicable to distributed systems, because partitions cannot be ruled out once data lives on more than one node


Real-World Examples

CP Systems

  1. MongoDB: Consistency prioritized, may become unavailable during partition
  2. HBase: Strong consistency, sacrifices availability
  3. Traditional RDBMS: ACID guarantees with synchronous replication (a single-node deployment is CA rather than CP)

AP Systems

  1. Cassandra: High availability, eventual consistency
  2. DynamoDB: Highly available, eventually consistent reads by default (strongly consistent reads are optional)
  3. DNS: High availability, eventual consistency acceptable

Best Practices

  1. Understand Your Requirements: Choose CP or AP based on needs
  2. Partition Tolerance: Always required in distributed systems
  3. Hybrid Approaches: Use different models for different data
  4. Monitor Partitions: Detect and handle partitions gracefully
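
For item 4, a common building block is heartbeat-based detection: each node records when it last heard from its peers and treats long silence as a possible partition. The following is a minimal sketch under that assumption; the `HeartbeatMonitor` class, its `timeout_s` parameter, and the injectable `clock` are hypothetical names chosen for the example. A timeout cannot distinguish a partition from a crashed or slow peer, so the result is only a suspicion.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat from each peer and flags silent ones."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock        # injectable so tests can control time
        self.last_seen = {}

    def record(self, peer):
        """Call whenever a heartbeat arrives from `peer`."""
        self.last_seen[peer] = self.clock()

    def suspected(self):
        """Return peers that have been silent for longer than the timeout."""
        now = self.clock()
        return sorted(
            peer for peer, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


if __name__ == "__main__":
    fake_time = [0.0]
    monitor = HeartbeatMonitor(5.0, clock=lambda: fake_time[0])
    monitor.record("node-a")
    monitor.record("node-b")
    fake_time[0] = 4.0
    monitor.record("node-b")      # node-b is still sending heartbeats
    fake_time[0] = 7.0
    print(monitor.suspected())    # prints ['node-a']
```

What happens after a peer is suspected depends on the CP or AP choice: a CP system might stop accepting writes, while an AP system might keep serving and queue changes for later reconciliation.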

Common Pitfalls

⚠️ Common Mistake: Thinking you can have all three (CAP).

Solution: In distributed systems, partitions happen. You must choose CP or AP.

⚠️ Common Mistake: Choosing the wrong trade-off for the use case.

Solution: Understand the requirements first. As a rule of thumb, financial data favors CP and social media favors AP.


Interview Tips

🎯 Interview Focus: Explain the CAP theorem and its trade-offs.

Common Questions

  • "Explain the CAP theorem."
  • "What does it mean to be CP vs AP?"
  • "Give examples of CP and AP systems."
  • "Can you have all three? Why or why not?"


Quick Reference Summary

CAP Theorem: In distributed systems, choose 2 of 3: Consistency, Availability, Partition Tolerance.

CP Systems: Consistency + Partition Tolerance (MongoDB, HBase)

AP Systems: Availability + Partition Tolerance (Cassandra, DynamoDB)

Key: Partition tolerance required in distributed systems. Choose CP or AP based on requirements.

