MySQL Index Principles and Optimization: B+Tree, Hash, Binary Search Deep Dive
Basic Concepts of Indexes
MySQL’s official definition of index is: A data structure used by storage engine to quickly find records.
Advantages and Disadvantages of Indexes
Advantages
- Query acceleration
- Sort optimization
- Range query
- Join optimization
Disadvantages
- Storage overhead
- Maintenance cost
- Write delay
Binary Search
Binary Search, also called binary search, is a search algorithm for efficiently finding specified data in an ordered array.
Algorithm Steps:
- Set left and right pointers
- Calculate middle position mid
- Compare middle element with target value
- Narrow search range
Hash Structure
Hash bottom layer implementation is implemented by Hash table, storing data based on key-value pairs.
Hash index:
- Suitable for equi-joins
- Does not support range queries
- Memory native Hash index, InnoDB adaptive hash index
InnoDB Adaptive Hash Index (AHI)
- Automatically creates hash index based on B+Tree in memory
- Only creates hash index for hot data
- Advantage: Equi-query requires only one hash calculation
B-Tree
B-Tree is a balanced multi-way search tree:
- Index values and data are distributed throughout the tree
- Each node can store multiple index values
- All leaf nodes are at the same level
B+Tree
B+Tree is a balanced multi-way search tree:
Features
- Non-leaf nodes only store index key values, not actual data
- Leaf nodes contain complete index key values and corresponding data
- Leaf nodes are connected through bidirectional pointers to form an ordered linked list
Advantages
- More efficient range queries
- Fewer disk I/O operations
- Better cache locality
- More efficient full table scans