MySQL Index Principles and Optimization: B+Tree, Hash, Binary Search Deep Dive

Basic Concepts of Indexes

MySQL’s official definition of index is: A data structure used by storage engine to quickly find records.

Advantages and Disadvantages of Indexes

Advantages

  • Query acceleration
  • Sort optimization
  • Range query
  • Join optimization

Disadvantages

  • Storage overhead
  • Maintenance cost
  • Write delay

Binary Search, also called binary search, is a search algorithm for efficiently finding specified data in an ordered array.

Algorithm Steps:

  1. Set left and right pointers
  2. Calculate middle position mid
  3. Compare middle element with target value
  4. Narrow search range

Hash Structure

Hash bottom layer implementation is implemented by Hash table, storing data based on key-value pairs.

Hash index:

  • Suitable for equi-joins
  • Does not support range queries
  • Memory native Hash index, InnoDB adaptive hash index

InnoDB Adaptive Hash Index (AHI)

  • Automatically creates hash index based on B+Tree in memory
  • Only creates hash index for hot data
  • Advantage: Equi-query requires only one hash calculation

B-Tree

B-Tree is a balanced multi-way search tree:

  • Index values and data are distributed throughout the tree
  • Each node can store multiple index values
  • All leaf nodes are at the same level

B+Tree

B+Tree is a balanced multi-way search tree:

Features

  1. Non-leaf nodes only store index key values, not actual data
  2. Leaf nodes contain complete index key values and corresponding data
  3. Leaf nodes are connected through bidirectional pointers to form an ordered linked list

Advantages

  • More efficient range queries
  • Fewer disk I/O operations
  • Better cache locality
  • More efficient full table scans