NoSQL databases trade the relational model for specialized data models that offer horizontal scalability, schema flexibility, and purpose-built access patterns. The four major families — document, key-value, wide-column, and graph — each optimize for different query shapes and consistency trade-offs. Most NoSQL systems operate under BASE semantics (Basically Available, Soft-state, Eventually consistent) rather than strict ACID, though many modern engines (MongoDB 4+, DynamoDB transactions) now support multi-document ACID transactions.

Key Points

  • Document stores (MongoDB, Firestore, Couchbase) embed related data as JSON/BSON sub-documents, optimizing for reads that fetch an entity and its children in a single round-trip.
  • Key-value stores (Redis, DynamoDB in KV mode, Aerospike) provide O(1) lookups but no secondary indexes or range scans on values — the key is everything.
  • Wide-column stores (Cassandra, HBase, Bigtable) organize data as a sparse, distributed map sorted by row key; column families group co-accessed columns physically on disk.
  • Graph databases (Neo4j, Amazon Neptune, TigerGraph) store vertices and edges natively; traversals like "friends-of-friends within 3 hops" run in milliseconds vs multi-join SQL queries.
  • DynamoDB single-table design collapses multiple entity types into one table using composite primary keys and GSIs to avoid expensive JOIN-equivalent operations.
  • Cassandra write path: write to Memtable + CommitLog (durable), flush to SSTables, compact with Leveled or STCS compaction — optimized for write-heavy workloads.
  • MongoDB BSON supports up to 16 MB documents; use GridFS for larger binary objects; document arrays enable atomic updates via $push and $pull operators.
  • Vector databases (Pinecone, Weaviate, Qdrant) are a fifth emerging family: they store high-dimensional embeddings and serve approximate nearest-neighbor (ANN) queries.
TypeExamplesData ModelBest Use CasesWeak At
DocumentMongoDB, Firestore, CouchbaseJSON/BSON documents with nested objectsContent mgmt, catalogs, user profiles, mobile appsMulti-document joins, relational integrity
Key-ValueRedis, DynamoDB, AerospikeOpaque value behind a flat keySessions, caching, leaderboards, shopping cartsComplex queries, range scans on values
Wide-ColumnCassandra, HBase, BigtableRows + dynamic column familiesTime-series, IoT, activity feeds, write-heavy logsAd-hoc queries, secondary access patterns
GraphNeo4j, Neptune, TigerGraphVertices + typed edges + propertiesSocial graphs, fraud detection, recommendationsHigh-volume aggregate analytics
Search EngineElasticsearch, OpenSearch, SolrInverted index over JSON docsFull-text search, log analytics, faceted searchTransactional writes, strong consistency

Real-World Example

LinkedIn uses Apache Cassandra for member activity streams (billions of writes/day). Airbnb uses MongoDB for listing documents where all listing attributes are embedded. Twitter/X uses a mix of MySQL (core tweets) and Redis (timeline cache). Uber uses Cassandra for trip records due to its tunable consistency and multi-region active-active replication.