Not every problem fits neatly into tables and rows. Sometimes your data looks more like a JSON blob, a social graph, or a time-series of sensor readings. NoSQL databases are purpose-built for these shapes. This topic explains what they trade away from relational databases, and what they gain in return.
Imagine a database spread across two data centers. The network cable between them gets cut. Now what? If you want both sides to agree on the latest data (Consistency), you have to stop accepting new requests until the connection is restored. If you want both sides to keep serving requests (Availability), you accept that they might disagree on some values. You cannot have both at once during a network split. This trade-off is the CAP theorem: when the network fails, choose between Consistency and Availability. Partition Tolerance (staying up despite the split) is non-negotiable for any distributed system.
Think of a document store like a folder of JSON files. Each document can have a different structure — one user might have three email addresses, another might have none. There is no rigid schema forcing every row to look identical. Related data is usually embedded directly inside the document rather than spread across separate tables. This makes reads fast (one fetch gets everything) but makes queries across documents more limited. Used by MongoDB, Couchbase, and Firestore.
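As a rough illustration, plain Python dictionaries can stand in for documents. The `users` collection, the field names, and both helper functions below are hypothetical and do not correspond to any particular database's API.

```python
import json

# Two "documents" in the same collection with different shapes.
users = {
    "u1": {
        "name": "Ada",
        "emails": ["ada@example.com", "ada@work.example", "ada@old.example"],
        # Related data is embedded instead of living in a separate table.
        "address": {"city": "London", "country": "UK"},
    },
    "u2": {
        "name": "Linus",  # no "emails" or "address" field at all
    },
}


def get_user(user_id: str) -> dict:
    """One fetch returns the whole document, embedded data included."""
    return users[user_id]


def users_in_city(city: str) -> list[str]:
    """A cross-document query: without an index, every document is scanned."""
    return [
        uid
        for uid, doc in users.items()
        if doc.get("address", {}).get("city") == city
    ]


print(json.dumps(get_user("u1"), indent=2))
print(users_in_city("London"))
```

The single-key fetch is cheap, while the city lookup has to touch every document, which mirrors the read-versus-query trade-off described above.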
A wide-column store is like a spreadsheet where different rows can have completely different columns. Data is organized by a row key, and within each row, you can store as many or as few named columns as you want. Columns that belong together are grouped into column families and stored physically close on disk, making it efficient to read just a specific set of columns across millions of rows. This design is a natural fit for time-series data: each device ID is the row key, and each timestamp becomes a column. Used by Cassandra, HBase, and Google Bigtable.
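The layout can be sketched with nested dictionaries: row key, then column family, then named columns. This is only a mental model under simplifying assumptions (everything in memory, hypothetical `put` and `read_family` helpers), not how Cassandra, HBase, or Bigtable are actually implemented.

```python
from collections import defaultdict

# row key -> column family -> {column name: value}
# Rows are sparse: each row stores only the columns it actually has.
table: dict[str, dict[str, dict[str, float]]] = defaultdict(
    lambda: defaultdict(dict)
)


def put(row_key: str, family: str, column: str, value: float) -> None:
    table[row_key][family][column] = value


def read_family(row_key: str, family: str) -> dict[str, float]:
    """Read one column family of one row, ordered by column name."""
    return dict(sorted(table[row_key][family].items()))


# Time-series layout: device ID is the row key, each timestamp is a column.
put("device-17", "temperature", "2024-01-01T00:01:00Z", 21.7)
put("device-17", "temperature", "2024-01-01T00:00:00Z", 21.5)
put("device-17", "humidity", "2024-01-01T00:00:00Z", 40.0)
put("device-42", "temperature", "2024-01-01T00:00:30Z", 19.2)

print(read_family("device-17", "temperature"))
```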
When you update a row in a database that has copies in multiple data centers, those copies do not all update at the exact same instant. For a brief window, different data centers may return different values for the same piece of data. Eventually, all copies will converge to the same value. This is called eventual consistency. It lets the database remain fast and available even across geographic distance. The tradeoff: if you read immediately after a write, you might get the old value. For things like social media likes or view counts, this is fine. For a bank balance, it is not.
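The stale-read window can be imitated with a toy simulation. The `Replica` and `Cluster` classes are hypothetical, and replication is triggered by hand here, whereas a real database replicates in the background.

```python
class Replica:
    def __init__(self, name: str):
        self.name = name
        self.data = {}
        self.pending = []  # updates received but not yet applied

    def apply_pending(self) -> None:
        for key, value in self.pending:
            self.data[key] = value
        self.pending.clear()


class Cluster:
    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]

    def write(self, key, value) -> None:
        # The first replica applies the write at once; the others lag behind.
        local, *remote = self.replicas
        local.data[key] = value
        for replica in remote:
            replica.pending.append((key, value))

    def read(self, key, replica_index: int):
        return self.replicas[replica_index].data.get(key)

    def replicate(self) -> None:
        for replica in self.replicas:
            replica.apply_pending()


cluster = Cluster(["us-east", "eu-west"])
cluster.write("likes:post1", 10)
cluster.replicate()

cluster.write("likes:post1", 11)
stale = cluster.read("likes:post1", 1)  # eu-west has not applied the update
cluster.replicate()
fresh = cluster.read("likes:post1", 1)  # replicas have converged

print(stale, fresh)  # 10 11
```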
Every read returns the most recent write. To achieve this, the database must confirm that enough replicas have the update before responding: either all of them, or a quorum that is guaranteed to overlap with later reads. This means waiting for network round trips, which adds latency. If the network is broken, the database may have to refuse requests entirely rather than risk returning stale data. Examples: PostgreSQL running on a single node (trivially consistent, since there are no replicas to disagree), ZooKeeper, HBase.
Every request gets a response, even during failures. The database never says "I am unavailable." The tradeoff is that the response might not reflect the very latest write — a replica that has not yet received the update might serve the old value. The system stays up and responsive, but correctness is slightly relaxed. Examples: CouchDB, Cassandra (at low consistency settings), DynamoDB.
Network partitions — where some servers can no longer communicate with others — are inevitable in any real distributed system. Partition tolerance means the system keeps working despite this. Because partitions cannot be prevented, every distributed database must be partition tolerant. The real choice, then, is not C vs A vs P — it is: when a partition happens, which do you sacrifice: Consistency or Availability?
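The choice can be made concrete with a deliberately tiny model of two nodes whose link is cut. The `Node` class, the `run` helper, and the `"CP"`/`"AP"` mode strings are assumptions made for this sketch. Real CP systems usually keep the majority side running and refuse requests only on the minority side.

```python
class Unavailable(Exception):
    """Raised when a CP node refuses a request during a partition."""


class Node:
    def __init__(self, mode: str):
        self.mode = mode  # "CP" or "AP"
        self.value = "v1"
        self.partitioned = False

    def _check(self) -> None:
        if self.partitioned and self.mode == "CP":
            raise Unavailable("peer unreachable; refusing to risk divergence")

    def write(self, value: str) -> None:
        self._check()
        self.value = value

    def read(self) -> str:
        self._check()
        return self.value


def run(mode: str):
    a, b = Node(mode), Node(mode)
    a.partitioned = b.partitioned = True  # the link between them is cut
    try:
        a.write("v2")  # a client writes on one side of the split
        return a.read(), b.read()  # then both sides are read
    except Unavailable:
        return "unavailable"


print(run("AP"))  # ('v2', 'v1'): both sides answer, but they disagree
print(run("CP"))  # unavailable: the request is refused instead
```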
CAP only describes behavior during a network failure. The PACELC model adds what happens during normal operation: even when there is no partition, you still choose between Latency (respond fast, possibly stale) and Consistency (respond correctly, but slower). DynamoDB optimizes for low latency in normal operation. Google Spanner optimizes for consistency in all cases. This extended view is more useful for everyday database selection than CAP alone.
| Type | Examples | Data Model | Best For | Weak At |
|---|---|---|---|---|
| Document | MongoDB, Couchbase | JSON/BSON docs | Flexible schemas, nested data | Multi-doc transactions, complex JOINs |
| Key-Value | Redis, DynamoDB | Key → blob | Caching, sessions, leaderboards | Querying by value fields |
| Wide-Column | Cassandra, HBase | Row key + dynamic cols | Time-series, write-heavy IoT | Complex queries, JOINs |
| Graph | Neo4j, TigerGraph | Nodes + edges | Social graphs, recommendations | Tabular analytics |
| Search | Elasticsearch | Inverted index | Full-text search, logs, metrics | Transactional writes |
Redis is not just a key-value cache. It is a data structure server. The choice of data structure determines what operations are O(1) vs O(n) and what problems each one solves efficiently.
| Type | Commands | O(n) operations | Use Case |
|---|---|---|---|
| String | GET, SET, INCR, APPEND | String ops on large values | Cache, counters, distributed locks (SET NX EX) |
| List | LPUSH, RPOP, LRANGE | LINDEX, LINSERT | Message queues, activity feeds, job queues |
| Hash | HGET, HSET, HMGET | HGETALL on large hashes | Object storage (user profile), session data |
| Set | SADD, SISMEMBER, SUNION | SMEMBERS on large sets | Unique visitors, tag systems, friend lists |
| Sorted Set | ZADD, ZRANGE, ZRANK | ZRANGEBYLEX on large sets | Leaderboards, rate limiters, time-series indexes |
| Stream | XADD, XREAD, XGROUP | Full stream scan | Event sourcing, CDC, Kafka-like message log |
| Bitmap | SETBIT, GETBIT, BITCOUNT | BITPOS on large bitmaps | Daily active users (1 bit/user/day = 125MB for 1B users) |
| HyperLogLog | PFADD, PFCOUNT | N/A (O(1) always) | Approximate unique count with fixed 12KB memory |
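
The bitmap row's memory figure follows from simple arithmetic, and the bit operations themselves are easy to imitate. The `Bitmap` class below is a hypothetical in-memory stand-in written for this sketch; it is not the Redis client API and it talks to no server.

```python
class Bitmap:
    """A tiny stand-in for the SETBIT / GETBIT / BITCOUNT operations."""

    def __init__(self, size_bits: int):
        self.bits = bytearray((size_bits + 7) // 8)

    def setbit(self, offset: int, value: int) -> None:
        byte, bit = divmod(offset, 8)
        mask = 0x80 >> bit  # offset 0 is the most significant bit of byte 0
        if value:
            self.bits[byte] |= mask
        else:
            self.bits[byte] &= ~mask & 0xFF

    def getbit(self, offset: int) -> int:
        byte, bit = divmod(offset, 8)
        return (self.bits[byte] >> (7 - bit)) & 1

    def bitcount(self) -> int:
        return sum(bin(b).count("1") for b in self.bits)


# One bit per user ID for a single day; repeat visits do not double count.
dau = Bitmap(16)
for user_id in (0, 3, 3, 9):
    dau.setbit(user_id, 1)
print(dau.bitcount())  # 3

# 1 bit per user for 1 billion users.
bytes_for_1b_users = 1_000_000_000 // 8
print(bytes_for_1b_users)  # 125000000 bytes, i.e. 125 MB
```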
In Cassandra, data modeling is driven by access patterns — not by normalization rules. The partition key determines which node holds the data. The clustering key determines sort order within a partition. Getting this wrong means scatter-gather queries or hot partitions.
In Cassandra, the partition key decides which physical server stores your data. Think of it as the address on a letter: all rows with the same partition key land on the same machine and its replicas. A query that includes the partition key goes straight to that one machine — fast and efficient. A query without it must ask every machine in the cluster — slow and expensive. Good partition keys have many distinct values that are spread evenly: user_id, device_id, or tenant_id. Bad choices are things like status (only a handful of values) or date alone (all writes for today pile onto one server).
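The effect of key cardinality on placement can be seen with a simplified router. This sketch hashes the key and takes it modulo a hypothetical node count (`NODES`); Cassandra itself uses a token ring rather than plain modulo, so treat this only as an approximation of the idea.

```python
import hashlib
from collections import Counter

NODES = 4  # hypothetical cluster size


def node_for(partition_key: str) -> int:
    """Map a partition key to a node by hashing it (simplified routing)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NODES


STATUSES = ["pending", "shipped", "delivered"]

# 10,000 writes keyed by user_id versus 10,000 writes keyed by status.
by_user = Counter(node_for(f"user-{i}") for i in range(10_000))
by_status = Counter(node_for(STATUSES[i % 3]) for i in range(10_000))

print("user_id:", sorted(by_user.items()))
print("status: ", sorted(by_status.items()))
```

With many distinct keys the writes spread across all nodes. With three status values they can land on at most three nodes, so at least one node sits idle while others run hot.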
Once the partition key routes you to the right server, the clustering key determines the sort order of rows within that partition. This is what makes range queries fast inside a partition. For example, if your partition key is user_id and your clustering key is event_time, then WHERE user_id = 42 AND event_time > '2024-01-01' goes straight to user 42's partition and scans forward in time order — no random searching needed. You can also use a compound partition key like (user_id, month) to prevent any single partition from growing too large over time.
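A rough model of that range query: each partition keeps its rows sorted by the clustering key, so a range read is one binary search followed by a sequential slice. The `Partition` class and its method names are hypothetical, and dates are ISO strings so that string order matches time order.

```python
import bisect
from collections import defaultdict


class Partition:
    """Rows of one partition, kept sorted by the clustering key."""

    def __init__(self):
        self.times = []  # clustering key values, always sorted
        self.events = []  # payloads, in the same order as self.times

    def insert(self, event_time: str, event: str) -> None:
        i = bisect.bisect_right(self.times, event_time)
        self.times.insert(i, event_time)
        self.events.insert(i, event)

    def after(self, since: str):
        """Rows with event_time > since: binary search, then a forward scan."""
        i = bisect.bisect_right(self.times, since)
        return list(zip(self.times[i:], self.events[i:]))


partitions = defaultdict(Partition)  # partition key (user_id) -> Partition

partitions[42].insert("2024-02-10", "login")
partitions[42].insert("2023-12-31", "signup")
partitions[42].insert("2024-01-15", "purchase")
partitions[7].insert("2024-03-01", "login")

# Roughly: WHERE user_id = 42 AND event_time > '2024-01-01'
print(partitions[42].after("2024-01-01"))
```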
Cassandra has no JOINs. If you need to look up the same data two different ways — for example, "get orders by user" AND "get orders by status" — you must store it in two separate tables, each optimized for one access pattern. Your application writes to both tables whenever data changes. This feels redundant compared to relational design, but it is the deliberate trade-off: Cassandra gives up flexibility in querying in exchange for extremely fast, predictable writes and reads at any scale.
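The dual-write pattern might look like the following sketch, where two dictionaries stand in for the two tables. The table names, the order fields, and both functions are illustrative assumptions. Note that the two writes are not atomic here, which is exactly the burden the application takes on.

```python
from collections import defaultdict

# Two "tables", each keyed for exactly one access pattern.
orders_by_user = defaultdict(dict)  # user_id -> {order_id: order}
orders_by_status = defaultdict(dict)  # status  -> {order_id: order}


def save_order(order: dict) -> None:
    """Write the same order to both tables (each table gets its own copy)."""
    orders_by_user[order["user_id"]][order["order_id"]] = dict(order)
    orders_by_status[order["status"]][order["order_id"]] = dict(order)


def change_status(user_id: str, order_id: str, new_status: str) -> None:
    """A status change must touch both tables, including the old status row."""
    current = orders_by_user[user_id][order_id]
    del orders_by_status[current["status"]][order_id]
    save_order({**current, "status": new_status})


save_order({"order_id": "o-1", "user_id": "u-42", "status": "pending"})
change_status("u-42", "o-1", "shipped")

print(orders_by_user["u-42"])  # "get orders by user"
print(orders_by_status["shipped"])  # "get orders by status"
```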
Cassandra lets you dial in how strict you want consistency to be on each operation. CONSISTENCY ONE returns as soon as the first replica responds: fastest, but possibly stale. CONSISTENCY ALL waits for every replica to confirm: slowest, always up to date, but a single unreachable replica fails the operation. CONSISTENCY QUORUM is the sweet spot: it waits for a majority (more than half) of replicas. If you use QUORUM for both writes and reads, you are guaranteed to see the latest acknowledged write, because the majority that served the read always overlaps with the majority that acknowledged the write. The general rule is R + W > RF, where R and W are the numbers of replicas that must answer a read and a write, and RF is the replication factor. With RF=3, QUORUM is 2, so 2 + 2 > 3 holds and the cluster stays strongly consistent even with one replica down.
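The overlap argument can be checked by brute force: enumerate every possible set of replicas that could acknowledge a write and every set that could serve a read, and confirm they always share at least one replica. The function names are made up for this sketch.

```python
from itertools import combinations


def quorum(rf: int) -> int:
    """A majority of rf replicas."""
    return rf // 2 + 1


def always_overlap(rf: int, w: int, r: int) -> bool:
    """True if every w-replica write set intersects every r-replica read set."""
    replicas = range(rf)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


rf = 3
q = quorum(rf)
print(q)  # 2
print(always_overlap(rf, q, q))  # True: QUORUM writes + QUORUM reads
print(always_overlap(rf, 1, 1))  # False: ONE + ONE can miss the latest write
print(always_overlap(rf, 3, 1))  # True: ALL writes + ONE reads
```

In each case the result agrees with the shortcut of checking whether the read and write replica counts add up to more than the replication factor.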
Relational databases represent relationships with foreign keys and JOINs. For shallow relationships (1–2 hops), JOINs are fast. For deep or variable-depth relationships (friendship networks, dependency graphs, recommendation paths), every extra hop adds another self-JOIN, and the intermediate result set can grow exponentially with depth. This is where graph databases win decisively.
In Neo4j, relationships between nodes are stored as physical pointers — following a relationship from one node to another is a direct memory jump, not a table scan. Compare this to a relational database: finding "friends of friends" requires two JOINs, each scanning rows to find matches. At five hops deep, the relational JOIN grows enormous. In a graph database, each hop is constant-time regardless of how large the overall graph is. This is why graph databases dominate for social networks, recommendation engines, and fraud detection — these problems are all about following chains of relationships.
Graph databases have their own query language. Neo4j uses Cypher, which is designed to look like the graph you are drawing. The pattern (Alice)-[:KNOWS]->(Bob) reads almost literally: "Alice knows Bob." To find everyone within 3 hops of Alice: MATCH (user:Person {name: 'Alice'})-[:KNOWS*1..3]-(friend:Person) RETURN DISTINCT friend.name. The *1..3 means "follow 1 to 3 KNOWS relationships." Writing this same query in SQL would require three separate self-joins and becomes difficult to read and slow at scale.
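Conceptually, the variable-length pattern behaves like a depth-limited breadth-first search over an adjacency list. The sketch below is an approximation under stated assumptions: the `edges` data and the `within_hops` function are invented for illustration, the start node is always excluded, and Cypher's exact path-matching rules are not reproduced.

```python
from collections import defaultdict, deque

# Hypothetical KNOWS relationships, treated as undirected like the pattern above.
edges = [("Alice", "Bob"), ("Bob", "Carol"), ("Carol", "Dave"), ("Dave", "Erin")]

graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)


def within_hops(graph, start: str, max_hops: int) -> set:
    """Everyone reachable from start in 1..max_hops hops."""
    seen = {start}
    found = set()
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                found.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return found


print(sorted(within_hops(graph, "Alice", 3)))  # ['Bob', 'Carol', 'Dave']
```

Each step only follows the neighbors of the current node, so the work depends on how many relationships are traversed rather than on the total size of the graph.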
Graph databases have clear weaknesses. Aggregating data across all nodes — like "count all users by country" — is not what they are built for; a relational or columnar database does this better. Write throughput is lower than what Cassandra or Kafka can handle. If your main queries are simple lookups like "get user by ID," the overhead of a graph database adds no value. Use a graph database when the core of your problem is navigating multi-step relationships, and the relationships themselves carry data (like the strength of a connection or the distance of a route).
A query for a single column is handled differently in the two models: a document store reads the entire document to reach one field, whereas a column store reads only the target column.
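The difference can be approximated by counting how many stored values each layout has to touch to average one field. The data set, the field names, and both functions are hypothetical, and real engines add compression, indexes, and caching that this sketch ignores.

```python
# Document layout: each record is stored (and read) as a whole.
documents = [
    {"id": i, "name": f"user{i}", "age": 20 + i % 50, "bio": "x" * 200}
    for i in range(1000)
]

# Columnar layout: all values of one field are stored together.
columns = {field: [doc[field] for doc in documents] for field in documents[0]}


def average_age_from_documents():
    values_read = 0
    total = 0
    for doc in documents:
        values_read += len(doc)  # the whole document is loaded
        total += doc["age"]
    return total / len(documents), values_read


def average_age_from_columns():
    ages = columns["age"]  # only the target column is loaded
    return sum(ages) / len(ages), len(ages)


print(average_age_from_documents())  # (44.5, 4000)
print(average_age_from_columns())  # (44.5, 1000)
```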
The CAP theorem says a distributed system cannot guarantee all three properties at once: when a network partition occurs, it must give up either consistency or availability. In practice, partition tolerance is non-negotiable at scale, so the real choice is CP (strong consistency, possible unavailability) vs. AP (always available, eventual consistency).
NoSQL doesn't automatically scale better than SQL. PostgreSQL handles millions of rows trivially. Choose NoSQL when you have a specific use case (flexible schema, write-heavy workload, extreme scale), not because of hype.
Eventual consistency means reads CAN return stale data. For financial transactions, inventory counts, or anything requiring correctness, eventual consistency is not acceptable without additional application-level conflict resolution.
Normalizing documents into separate collections and joining them in application code undoes NoSQL's benefits. Design around access patterns: embed data that's always read together. Reference (FK-style) only when data is large and rarely accessed together.
CONSISTENCY QUORUM (read+write quorum) gives you strong consistency at the cost of availability. CONSISTENCY ONE is eventual. The "eventual" window can be milliseconds (same DC) or seconds (cross-region). Measure it for your workload.
$lookup (2 queries), but modular and supports independent comment queries. The right choice depends on your dominant read pattern.
TransactWriteItems / TransactGetItems: serializable but limited to 100 items per transaction (25 before the 2022 limit increase) and single-region. Pricing is capacity-unit based, so design access patterns to be efficient. Single-table design patterns use composite keys and GSIs (Global Secondary Indexes) to support multiple access patterns without JOINs.
ZADD leaderboard 15230 "alice" adds alice with score 15230. ZREVRANK leaderboard "alice" returns alice's rank (0-indexed from highest). ZREVRANGE leaderboard 0 9 WITHSCORES returns the top 10 with scores. Score updates: ZINCRBY leaderboard 500 "alice" atomically adds 500 to alice's score. Why not a SQL table with an index? SQL requires a query + sort + limit, which works fine for small leaderboards. For leaderboards with millions of entries updated thousands of times per second, Redis Sorted Set's O(log n) write and O(log n + k) range read are ideal. Combine with Redis HyperLogLog for "unique players today" metrics and a Hash for player metadata (name, avatar URL): three data structures, three problems solved.
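The sorted-set operations used for the leaderboard can be imitated in memory to see what each one returns. The `Leaderboard` class is a hypothetical stand-in written for this sketch, with methods named after the Redis commands; it is not the Redis client API. It uses a sorted Python list, so inserts cost O(n) here rather than the O(log n) quoted above.

```python
import bisect


class Leaderboard:
    """In-memory stand-in for a sorted set: member -> score, ordered by score."""

    def __init__(self):
        self.scores = {}  # member -> score
        self.ordered = []  # sorted list of (score, member)

    def zadd(self, member: str, score: int) -> None:
        if member in self.scores:
            old = (self.scores[member], member)
            self.ordered.pop(bisect.bisect_left(self.ordered, old))
        self.scores[member] = score
        bisect.insort(self.ordered, (score, member))

    def zincrby(self, member: str, delta: int) -> int:
        new_score = self.scores.get(member, 0) + delta
        self.zadd(member, new_score)
        return new_score

    def zrevrank(self, member: str):
        """0-indexed rank counted from the highest score; None if absent."""
        if member not in self.scores:
            return None
        entry = (self.scores[member], member)
        index = bisect.bisect_left(self.ordered, entry)
        return len(self.ordered) - 1 - index

    def top(self, k: int):
        """The k highest entries as (member, score), best first."""
        return [(m, s) for s, m in self.ordered[::-1][:k]]


lb = Leaderboard()
lb.zadd("alice", 15230)
lb.zadd("bob", 20000)
lb.zadd("carol", 15500)
print(lb.zrevrank("alice"))  # 2
print(lb.zincrby("alice", 500))  # 15730
print(lb.zrevrank("alice"))  # 1
print(lb.top(2))  # [('bob', 20000), ('alice', 15730)]
```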