Tag: Canal

8 articles

Big Data 268 - Real-time Warehouse ODS Layer: Writing Kafka Dimension Tables into DIM

Kafka is a distributed streaming platform for high-throughput message passing. In ETL processes, Kafka serves as a data message queue or stream processing source.

Big Data 269 - Real-time Warehouse DIM, DW and ADS: Scala Pipelines to HBase

Writing the original MySQL area table to HBase: the area table is converted into region ID, region name, city ID, city name, province ID, and province name, then written to HBase.

Big Data #266: Canal Integration with Kafka - Real-time Data Sync

This article introduces Alibaba's open-source Canal tool, which implements Change Data Capture (CDC) by parsing MySQL binlog.

Big Data 267 - Real-Time Warehouse ODS: Lambda and Kappa Architecture

In internet companies, common ODS data includes business log data (Log) and business DB data.

Big Data 265 - Canal Deployment

Canal is an open-source data synchronization tool from Alibaba that parses MySQL incremental logs and synchronizes the changes.

Big Data 263 - Canal Working Principle: Workflow and MySQL Binlog Basics

Canal is an open-source tool for incremental subscription to and consumption of the MySQL binlog.

Canal Data Sync: Introduction, Background, Principles and Architecture

Alibaba B2B's cross-region business between domestic sellers and overseas buyers drove the need for data synchronization between Hangzhou and US data centers.

Big Data 261 - Real-Time Warehouse Business Table Structure

A real-time data warehouse differs from traditional batch-processing data warehouses by emphasizing low latency and high throughput.