Visual SLAM Overview

Visual SLAM (V-SLAM) uses cameras as the primary sensor to perform simultaneous localization and mapping without relying on LiDAR.

Technical Process

  1. Sensor input (monocular, stereo, RGB-D)
  2. Frontend (visual odometry):
    • Feature-based method (e.g., ORB-SLAM)
    • Direct method (LSD-SLAM, DSO)
  3. Backend optimization (graph optimization with g2o/GTSAM, or filtering with an EKF)
  4. Loop closure detection (DBoW2)
  5. Mapping (sparse/dense)
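The data flow through the stages above can be sketched as a minimal pipeline skeleton. This is an illustrative sketch only: every class and function name here (`Frame`, `MapState`, `frontend`, `backend`, `loop_closure`) is hypothetical and stands in for the much larger components of a real system.

```python
from dataclasses import dataclass, field

# Illustrative sketch of the V-SLAM pipeline stages listed above.
# All class and method names are hypothetical, not from any real library.

@dataclass
class Frame:
    frame_id: int
    features: list          # e.g., ORB keypoints + descriptors

@dataclass
class MapState:
    poses: list = field(default_factory=list)       # estimated camera poses
    landmarks: list = field(default_factory=list)   # sparse 3-D points

def frontend(frame, state):
    """Visual odometry: track features and append a rough pose estimate."""
    state.poses.append(("pose", frame.frame_id))
    return state

def backend(state):
    """Placeholder for bundle adjustment / pose-graph optimization (g2o, GTSAM)."""
    return state

def loop_closure(frame, state):
    """Placeholder for appearance-based loop detection (e.g., DBoW2)."""
    return False

def process(frames):
    state = MapState()
    for f in frames:
        state = frontend(f, state)
        if loop_closure(f, state):
            state = backend(state)      # global correction on loop closure
    return backend(state)               # final refinement

result = process([Frame(i, []) for i in range(5)])
print(len(result.poses))  # 5 poses, one per input frame
```

The key structural point is that the frontend runs per frame while the backend runs selectively (on loop closure or at keyframe rate), which is what keeps the pipeline real-time.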

Classic Solutions

ORB-SLAM Series

  • Feature-based visual SLAM
  • Supports monocular, stereo, and RGB-D cameras
  • Centimeter-level localization accuracy
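ORB descriptors are 256-bit binary strings compared by Hamming distance, which is why feature-based matching is fast. The sketch below uses synthetic random descriptors as stand-ins (real ones would come from an ORB detector) and matches them by brute-force Hamming nearest neighbour.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for ORB descriptors: 256-bit strings packed as 32 bytes.
# Real descriptors would come from a detector such as OpenCV's ORB.
desc_a = rng.integers(0, 256, size=(20, 32), dtype=np.uint8)
desc_b = desc_a.copy()
# Flip one bit in each of image B's descriptors to simulate viewpoint change.
desc_b[:, 0] ^= 0x01

def hamming_matrix(a, b):
    """Pairwise Hamming distances between two sets of binary descriptors."""
    xor = a[:, None, :] ^ b[None, :, :]              # (Na, Nb, 32)
    return np.unpackbits(xor, axis=-1).sum(axis=-1)  # popcount per pair

d = hamming_matrix(desc_a, desc_b)
matches = d.argmin(axis=1)            # nearest neighbour in B for each A
print((matches == np.arange(20)).all())  # True: every descriptor matched
```

Each perturbed descriptor is 1 bit away from its counterpart but roughly 128 bits away from unrelated descriptors, so the nearest-neighbour match is unambiguous; real systems add a ratio test and geometric verification on top.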

RTAB-Map

  • Real-time appearance-based mapping
  • Innovative memory management (WM/STM/LTM architecture)
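RTAB-Map bounds real-time cost by moving locations between short-term memory (recent frames), working memory (candidates for loop-closure search), and long-term memory (offloaded to a database). The toy class below mimics only that transfer policy; it is not RTAB-Map's implementation, whose transfers are driven by location weights and a time budget rather than the simple size limits used here.

```python
from collections import deque

class MemoryManager:
    """Toy WM/STM/LTM transfer policy inspired by RTAB-Map.

    Illustrative only: real RTAB-Map transfers are driven by location
    weights and a fixed time budget, not the plain counts used here.
    """

    def __init__(self, stm_size=3, wm_size=5):
        self.stm = deque()   # short-term memory: most recent locations
        self.wm = []         # working memory: loop-closure candidates
        self.ltm = []        # long-term memory: offloaded to a database
        self.stm_size = stm_size
        self.wm_size = wm_size

    def add_location(self, loc):
        self.stm.append(loc)
        if len(self.stm) > self.stm_size:
            self.wm.append(self.stm.popleft())   # STM -> WM when STM is full
        if len(self.wm) > self.wm_size:
            self.ltm.append(self.wm.pop(0))      # WM -> LTM to bound WM size

mm = MemoryManager()
for i in range(10):
    mm.add_location(i)
print(len(mm.stm), len(mm.wm), len(mm.ltm))  # 3 5 2
```

Because loop-closure search only scans the bounded working memory, detection time stays roughly constant no matter how large the map grows.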

VINS-Fusion

  • Tightly coupled visual-inertial SLAM
  • Sliding window optimization
  • Loop closure detection
  • Multi-sensor fusion
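Sliding window optimization keeps only the N most recent states and drops (marginalizes) the oldest, so the solve stays constant-size. The 1-D toy below shows just that window structure: scalar positions, noisy relative-displacement measurements, and a least-squares re-solve per step. Real VINS-style windows hold 6-DoF poses and IMU preintegration terms, and marginalization produces a prior rather than simply discarding the state.

```python
import numpy as np

rng = np.random.default_rng(1)

WINDOW = 4
true_x = np.arange(10, dtype=float)              # ground-truth 1-D trajectory
z = np.diff(true_x) + rng.normal(0, 0.01, 9)     # noisy relative measurements

window = [0.0]                                   # first state anchored at 0
for k, zk in enumerate(z):
    window.append(window[-1] + zk)               # predict the newest state
    if len(window) > WINDOW:
        window.pop(0)                            # "marginalize" oldest state
    # Re-optimize the states in the window: linear least squares over the
    # relative measurements connecting consecutive window states, plus a
    # prior anchoring the oldest state at its current estimate.
    n = len(window)
    A = np.zeros((n, n)); b = np.zeros(n)
    A[0, 0] = 1.0; b[0] = window[0]              # prior on oldest state
    for i in range(n - 1):
        A[i + 1, i] = -1.0; A[i + 1, i + 1] = 1.0
        b[i + 1] = z[k - (n - 2) + i]            # measurement for this edge
    window = list(np.linalg.lstsq(A, b, rcond=None)[0])

# Last window spans states x6..x9, so its extent is about 3.0.
print(round(window[-1] - window[0], 1))
```

The design trade-off this illustrates: the window gives near-batch accuracy over recent states at fixed cost, while global consistency is left to loop closure.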

LSD-SLAM / DSO

  • Direct method SLAM
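Direct methods skip feature extraction and minimize photometric error on raw pixel intensities. The toy below recovers a known integer shift between two 1-D intensity profiles by exhaustive search over candidate shifts; real systems (LSD-SLAM, DSO) minimize the same kind of residual over a 6-DoF camera pose using image gradients and Gauss-Newton rather than brute force.

```python
import numpy as np

rng = np.random.default_rng(2)
profile = rng.normal(size=100)          # stand-in for one image row
shifted = np.roll(profile, 3)           # second "image", moved by 3 pixels

def photometric_error(shift):
    """Sum of squared intensity differences under a candidate shift."""
    return np.sum((np.roll(profile, shift) - shifted) ** 2)

errors = {s: photometric_error(s) for s in range(-5, 6)}
best = min(errors, key=errors.get)
print(best)  # 3: the shift at which the photometric error vanishes
```

Because the residual is built from intensities rather than matched keypoints, direct methods can use weakly textured regions, but they are correspondingly sensitive to brightness changes, which is why DSO adds photometric calibration.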

OpenVSLAM / ORB-SLAM3

  • Multi-map support (ORB-SLAM3's Atlas system)
  • Improved visual-inertial fusion (ORB-SLAM3)

Application Scenarios

  • Indoor robots
  • AR/VR
  • UAVs
  • Service robot navigation
  • Autonomous driving visual positioning module

Technical Challenges

  • Dynamic object interference
  • Weak texture scenes
  • Balancing real-time performance against accuracy