Basic Data Collection Methods (10 Types)
- Manual entry
- Sensor collection
- Web crawlers
- Database export
- Log collection
- API calls
- File import
- Image/video collection
- Voice collection
- RFID/NFC
Teleoperation
Game Controllers
- Xbox / PlayStation
Professional Equipment
- 3Dconnexion
- Force Dimension
VR Devices
- HTC Vive
- Manus VR
Stanford ALOHA: Uses low-cost leader arms that the operator moves by hand while follower arms mirror the joint positions, supporting fine-grained bimanual manipulation
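Input devices like the game controllers and 6-DOF devices listed above usually report normalized axis values, which then have to be turned into a motion command. The following is a minimal sketch of that mapping, not the method of any particular system. The `Twist` type, the function names, the speed limits, and the deadzone value are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Twist:
    """Hypothetical end-effector velocity command."""
    linear: tuple   # (vx, vy, vz) in m/s
    angular: tuple  # (wx, wy, wz) in rad/s


def apply_deadzone(value: float, deadzone: float) -> float:
    """Zero out small inputs and rescale the rest so the range stays [-1, 1]."""
    if abs(value) <= deadzone:
        return 0.0
    sign = 1.0 if value > 0 else -1.0
    return sign * (abs(value) - deadzone) / (1.0 - deadzone)


def axes_to_twist(axes, max_linear=0.1, max_angular=0.5, deadzone=0.1) -> Twist:
    """Map six normalized axes (x, y, z, roll, pitch, yaw) to a velocity command.

    max_linear, max_angular, and deadzone are assumed values, not taken
    from any specific device or robot.
    """
    if len(axes) != 6:
        raise ValueError("expected exactly 6 axis values")
    # Clamp to [-1, 1] first, then remove stick drift with the deadzone.
    scaled = [apply_deadzone(max(-1.0, min(1.0, a)), deadzone) for a in axes]
    linear = tuple(max_linear * v for v in scaled[:3])
    angular = tuple(max_angular * v for v in scaled[3:])
    return Twist(linear, angular)
```

The deadzone keeps a resting stick from producing slow drift in the recorded demonstrations, and the speed caps bound how fast the robot can be driven during collection.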
Simulation Collection
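Uses simulators such as Unity or Gazebo, together with a physics engine, to generate training data without physical hardware.
OpenAI Dactyl: Trained on roughly 100 years of simulated experience, using domain randomization to transfer the learned policy to a physical robot hand.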
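Domain randomization means drawing new simulator parameters for each episode so that the collected data covers a range of plausible physical conditions. The sketch below shows only the sampling step. The parameter names and ranges are hypothetical and are not taken from Dactyl or from any simulator's API.

```python
import random
from dataclasses import dataclass


@dataclass
class PhysicsParams:
    """Hypothetical per-episode simulator settings."""
    friction: float
    object_mass_kg: float
    actuator_gain: float


# Assumed sampling ranges, chosen only for illustration.
RANGES = {
    "friction": (0.5, 1.5),
    "object_mass_kg": (0.03, 0.3),
    "actuator_gain": (0.75, 1.25),
}


def sample_params(rng: random.Random) -> PhysicsParams:
    """Draw one set of physics parameters uniformly from RANGES."""
    return PhysicsParams(
        **{name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}
    )


def randomized_episodes(n: int, seed: int = 0) -> list:
    """Return n parameter sets, one per simulated episode, reproducibly."""
    rng = random.Random(seed)
    return [sample_params(rng) for _ in range(n)]
```

In a real pipeline each `PhysicsParams` instance would be applied to the simulator before the episode is rolled out. Seeding the generator makes a collected dataset reproducible.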
Human Demonstration
Wearable devices (IMUs, force sensors) capture the demonstrator's motion and contact forces.
Industrial hand guiding (lead-through teaching): Records poses at a sampling rate of 100-1000 Hz.
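Recording at a fixed sampling rate comes down to reading the pose at evenly spaced timestamps. The sketch below simulates this with a pose function of time instead of a real-time loop, so it runs without hardware. `record_poses` and `demo_pose` are hypothetical names, and a real controller would read poses from the robot's own interface.

```python
import math


def record_poses(read_pose, rate_hz: float, duration_s: float) -> list:
    """Sample read_pose(t) at a fixed rate and return (timestamp, pose) pairs."""
    if rate_hz <= 0 or duration_s < 0:
        raise ValueError("rate_hz must be positive and duration_s non-negative")
    n_samples = int(round(rate_hz * duration_s)) + 1
    # Derive each timestamp from the sample index so float error
    # does not accumulate over a long recording.
    return [(i / rate_hz, read_pose(i / rate_hz)) for i in range(n_samples)]


def demo_pose(t: float) -> tuple:
    """Hypothetical stand-in for a pose reading: a point moving on a circle."""
    return (0.3 * math.cos(t), 0.3 * math.sin(t), 0.2)


# Two seconds of guidance sampled at 500 Hz, inside the 100-1000 Hz range.
trajectory = record_poses(demo_pose, rate_hz=500, duration_s=2.0)
```

At 500 Hz a two-second demonstration already yields about a thousand samples, so storage and downsampling choices matter for longer sessions.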
Internet Data Utilization
Scraping data from YouTube, forums, and social media for multimodal AI training.
Data Collection Guidelines
| Task Complexity | Sample Size |
|---|---|
| Simple tasks | 50-200 demos |
| Medium tasks | 500-2000 demos |
| Complex tasks | 5000+ demos |
Key: Diversity (environment, objects, operations) is more important than quantity.
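One way to act on these guidelines is to check a dataset's size against the table and to measure how evenly it spreads across environments, objects, and operations. The sketch below assumes each demonstration is a dict with `environment`, `object`, and `operation` keys. That schema, the function names, and the `largest_share` metric are illustrative assumptions rather than an established standard.

```python
from collections import Counter

# Minimum and maximum demo counts from the table above.
# None means the table gives no upper bound.
GUIDELINES = {
    "simple": (50, 200),
    "medium": (500, 2000),
    "complex": (5000, None),
}


def meets_guideline(n_demos: int, complexity: str) -> bool:
    """Return True if the dataset reaches the lower bound for its complexity."""
    low, _ = GUIDELINES[complexity]
    return n_demos >= low


def diversity_report(demos: list) -> dict:
    """Count distinct values per axis and the share held by the most common one."""
    if not demos:
        raise ValueError("demos must not be empty")
    report = {}
    for key in ("environment", "object", "operation"):
        counts = Counter(d[key] for d in demos)
        report[key] = {
            "unique": len(counts),
            "largest_share": max(counts.values()) / len(demos),
        }
    return report
```

A `largest_share` close to 1.0 on any axis suggests that adding more demos of the same kind will help less than collecting under new conditions.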