Tag: SQL
9 articles
Hive Slowly Changing Dimension Type 2: Order History State Management
An offline data warehouse needs to store order history state at low cost while supporting daily rollback and change analysis.
MySQL ShardingSphere: SQL Parse, Route, Rewrite and Execution Flow
Deep dive into the six major stages of ShardingSphere's sharding flow (SQL parsing, query optimization, SQL routing, SQL rewriting, SQL execution, and result merging) wit...
SparkSQL Statements: DataFrame Operations, SQL Queries &
Comprehensive guide to SparkSQL core usage including DataFrame API operations, SQL query syntax, lateral view explode, and Hive integration via enableHiveSupport for meta...
Big Data 84 - SparkSQL Internals: Five Join Strategies & Catalyst Optimizer
This is article 84 in the Big Data series, an in-depth analysis of how SparkSQL's internals automatically select a Join strategy and of the SQL parsing and optimization flow.
SparkSQL Core Abstractions: RDD, DataFrame, Dataset & SparkSession
This is article 81 in the Big Data series, covering the features, use cases, and mutual conversions of Spark's three core data abstractions.
SparkSQL Operators: Transformation & Action Operations
This is article 82 in the Big Data series, systematically introducing SparkSQL Transformation and Action operators with complete test cases.
SparkSQL Introduction: SQL & Distributed Computing Fusion
Systematic introduction to SparkSQL's evolution, its core abstractions (DataFrame/Dataset), the principles of the Catalyst optimizer, and practical usage of multi-data source integra...
Hive DDL and DML Operations
Systematic explanation of Hive DDL (database/table creation, internal and external tables) and DML (data loading, insertion, query) operations.
Hive HQL Advanced: Data Import/Export and Query Practice
Deep dive into Hive's multiple data import methods (LOAD/INSERT/External Table/Sqoop), data export methods, and practical usage of HQL query operations like aggregation...