NoSQL Overview

NoSQL ("Not Only SQL") is a class of database systems that use non-relational data models and complement, rather than replace, traditional relational databases.

Core Advantages

  1. High throughput: Spreads reads and writes across many nodes in a distributed architecture
  2. Flexible data models: Document, key-value, column-family, and graph databases
  3. Horizontal scaling: Capacity grows by adding nodes rather than upgrading a single server
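
As a rough illustration of horizontal scaling, the sketch below routes keys to shards with a stable hash. The function name `shard_for`, the `user:<n>` key format, and the shard count of 4 are assumptions made for this example; production systems typically use consistent hashing or range-based chunks instead of a plain modulo.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo the shard count.
    return int.from_bytes(digest[:8], "big") % num_shards


# Route 10,000 hypothetical keys across 4 shards and count how many land on each.
keys = [f"user:{i}" for i in range(10_000)]
distribution = Counter(shard_for(k, 4) for k in keys)
print(sorted(distribution.items()))
```

Adding a fifth shard under this scheme would remap most keys, which is the main reason real systems prefer consistent hashing or chunk migration.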

Four Major Families

  1. Column-Family (Wide-Column) Storage - HBase

    • Modeled on Google's Bigtable paper
    • Suited to very large, sparse tables of structured or semi-structured data
    • Typical applications: Financial transaction records, telecom call data
  2. Key-Value Storage - Redis

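    • In-memory data store with optional persistence (RDB snapshots, AOF log)
    • Typical applications: Session cache, leaderboards
    • Features: Very low latency; a single instance is commonly cited at 100k+ operations per second, depending on workload and hardware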
  3. Graph Storage - Neo4j

    • Native graph database: stores nodes and relationships directly and is queried with Cypher
    • Typical applications: Social networks, recommendation systems, fraud detection
  4. Document Storage - MongoDB

    • Stores documents in BSON (Binary JSON) format
    • Typical applications: Content management systems, product catalogs
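
To make the four families more concrete, the sketch below models the same small piece of data in each style using plain Python structures. The record contents, key naming, and the helper `followed_names` are invented for illustration and do not use the actual client APIs of HBase, Redis, Neo4j, or MongoDB.

```python
# Key-value (Redis-style): a flat mapping from key to opaque value.
kv_store = {"user:42:name": "Ada", "user:42:city": "London"}

# Document (MongoDB-style): one nested, self-describing record.
document = {"_id": 42, "name": "Ada", "address": {"city": "London"}}

# Column-family (HBase-style): row key -> column family -> qualifier -> value.
wide_row = {
    "user#42": {
        "profile": {"name": "Ada", "city": "London"},
        "activity": {"last_login": "2024-01-01"},
    }
}

# Graph (Neo4j-style): nodes plus typed, directed relationships.
nodes = {42: {"label": "User", "name": "Ada"}, 7: {"label": "User", "name": "Bob"}}
edges = [(42, "FOLLOWS", 7)]


def followed_names(user_id: int) -> list[str]:
    """Traverse FOLLOWS edges from one node and return the target names."""
    return [nodes[dst]["name"] for src, rel, dst in edges
            if src == user_id and rel == "FOLLOWS"]


print(followed_names(42))
```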

MongoDB Introduction

MongoDB is a document-oriented NoSQL database. Its Community Server is source-available under the Server Side Public License (SSPL), which replaced the AGPL in 2018.

Notable Features

  1. Document storage structure: Uses BSON format to store data
  2. Powerful query capabilities: Supports rich query expressions and indexes
  3. Distributed architecture: Natively supports sharding and replica sets

Architecture

  1. Storage engine: WiredTiger (default since 3.2), In-Memory (Enterprise edition)
  2. Query processing layer: Supports CRUD, aggregation pipeline, text search
  3. Sharding and replication layer: Replica sets, sharded clusters
  4. Security system: SCRAM authentication, role-based access control (RBAC), TLS transport encryption
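
The query processing layer can be sketched by building a filter and an aggregation pipeline as plain Python dictionaries, which is the form a driver such as pymongo accepts. The collection and its fields (`status`, `amount`, `customer_id`) are hypothetical, and `top_customers` assumes its argument behaves like a pymongo `Collection`.

```python
from typing import Any


def paid_orders_filter(min_amount: float) -> dict[str, Any]:
    """Query filter: paid orders at or above a minimum amount."""
    return {"status": "paid", "amount": {"$gte": min_amount}}


def revenue_by_customer_pipeline(min_amount: float) -> list[dict[str, Any]]:
    """Aggregation pipeline: filter, group by customer, sort by total, keep 10."""
    return [
        {"$match": paid_orders_filter(min_amount)},
        {"$group": {"_id": "$customer_id", "total": {"$sum": "$amount"}}},
        {"$sort": {"total": -1}},
        {"$limit": 10},
    ]


def top_customers(orders: Any, min_amount: float = 0.0) -> list[dict[str, Any]]:
    """Run the pipeline; `orders` is assumed to be a pymongo Collection."""
    return list(orders.aggregate(revenue_by_customer_pipeline(min_amount)))
```

Keeping the pipeline in a separate function makes it possible to inspect or unit-test the stages without a running server.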

Comparison with Traditional Relational Databases

Feature               MongoDB                                              Relational Database
Data Model            Document model, flexible schema                      Tables with a predefined schema
Scalability           Built-in horizontal scaling (sharding)               Primarily vertical scaling
Transaction Support   Multi-document ACID (4.0+; sharded clusters 4.2+)    Full ACID transactions
Join Operations       Embedded documents or $lookup aggregation stage      JOIN clauses
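
The Join Operations row can be illustrated with a small sketch that contrasts an embedded document with a reference resolved at query time. The sample `customers` and `orders` data are invented, and `emulate_lookup` is a hypothetical pure-Python approximation of what MongoDB's `$lookup` stage produces, not the server's implementation.

```python
from typing import Any

Doc = dict[str, Any]

# Normalized form: orders reference customers by id, as in a relational design.
customers: list[Doc] = [{"_id": 1, "name": "Ada"}]
orders: list[Doc] = [{"_id": 100, "customer_id": 1, "total": 25.0}]

# Embedded form: the customer data lives inside the order, so no join is needed.
embedded_order: Doc = {"_id": 100, "total": 25.0, "customer": {"_id": 1, "name": "Ada"}}

# The aggregation stage that would join the normalized form on the server.
lookup_stage: Doc = {
    "$lookup": {
        "from": "customers",
        "localField": "customer_id",
        "foreignField": "_id",
        "as": "customer",
    }
}


def emulate_lookup(left: list[Doc], right: list[Doc],
                   local_field: str, foreign_field: str, as_field: str) -> list[Doc]:
    """Attach to each left document an array of matching right documents."""
    joined = []
    for doc in left:
        matches = [r for r in right if r.get(foreign_field) == doc.get(local_field)]
        joined.append({**doc, as_field: matches})
    return joined


print(emulate_lookup(orders, customers, "customer_id", "_id", "customer"))
```

Embedding favors read speed for data that is always fetched together, while references avoid duplicating data that changes independently.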

Use Cases

More Suitable for MongoDB:

  • Rapid iterative development
  • Semi-structured or unstructured data
  • High scalability requirements
  • Geospatial data processing
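
As a sketch of the geospatial use case, the code below builds a GeoJSON point and a `$near` query filter as plain dictionaries. The field name `location`, the collection name `places`, and the sample coordinates are assumptions; running the query would also require a 2dsphere index on the field.

```python
from typing import Any


def geojson_point(longitude: float, latitude: float) -> dict[str, Any]:
    """Build a GeoJSON Point; GeoJSON lists longitude first, then latitude."""
    if not -180 <= longitude <= 180 or not -90 <= latitude <= 90:
        raise ValueError("coordinates out of range")
    return {"type": "Point", "coordinates": [longitude, latitude]}


def near_filter(longitude: float, latitude: float, max_meters: float) -> dict[str, Any]:
    """Filter for documents whose `location` lies within max_meters of a point."""
    return {
        "location": {
            "$near": {
                "$geometry": geojson_point(longitude, latitude),
                "$maxDistance": max_meters,
            }
        }
    }


# With pymongo (assumed, not imported here), usage would look roughly like:
#   places.create_index([("location", "2dsphere")])
#   nearby = places.find(near_filter(-73.97, 40.77, 500))
print(near_filter(-73.97, 40.77, 500))
```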

More Suitable for Relational Database:

  • Complex transaction processing
  • Stable, well-defined data structure
  • Complex JOIN operations across many tables