Tag: etl
6 articles
Data Warehouse Introduction: Four Characteristics, OLTP v...
2026 engineering practice, covering core concepts and implementation concerns for data warehouses: starting from enterprise data silos, explaining four...
Sqoop Incremental Import and CDC Change Data Capture Prin...
Introduces Sqoop's --incremental append import mechanism and explains in depth the core concepts of CDC (Change Data Capture), a comparison of capture methods, and modern solutions like Flink CDC, Deb...
Sqoop Partial Import: --query, --columns, --where Three F...
A detailed explanation of three ways to import a subset of MySQL data into HDFS with Sqoop: a custom query (--query), selected columns (--columns), and WHERE-clause filtering (--where), with applicable scenarios and precautions.
Sqoop and Hive Integration: MySQL ↔ Hive Bidirectional Da...
Demonstrates using Sqoop to import MySQL data directly into a Hive table and to export Hive data back to MySQL, covering key parameters such as --hive-import and --create-hive-table.
Sqoop Data Migration ETL Tool Introduction and Installation
An introduction to Apache Sqoop's core principles, use cases, and installation and configuration steps on a Hadoop cluster, to help you get started quickly with batch data migration between MySQL and HDFS/Hive.
Sqoop Practice: MySQL Full Data Import to HDFS
A complete example of importing a MySQL table into HDFS with Sqoop, covering the core parameters, the MapReduce parallelism mechanism, and verification of the execution results.