Replica
A replica is a copy of the same data stored on a different physical node in a distributed system. The core idea is to improve system reliability through data redundancy.
Core Advantages
- High Availability: when a node suffers a hardware failure, network partition, or software crash, the remaining replica nodes can take over immediately
- Load Balancing: read requests can be routed across replica nodes to spread load
- Data Security: guards against data loss from a single point of failure
Technical Implementation
In ClickHouse, the replica mechanism is implemented through the collaboration of two components:
- ReplicatedMergeTree Engine: Base table engine with built-in replica sync logic
- ZooKeeper/Keeper Coordination Service: Stores replica metadata and sync status
Configuration Example
<!-- config.xml -->
<remote_servers>
    <cluster_3shards_3replicas>
        <shard>
            <replica>
                <host>ch01</host>
                <port>9000</port>
            </replica>
        </shard>
    </cluster_3shards_3replicas>
</remote_servers>
Create replicated table:
CREATE TABLE metrics (
    timestamp DateTime,
    value Float64
) ENGINE = ReplicatedMergeTree(
    '/clickhouse/tables/{shard}/metrics',
    '{replica}'
)
ORDER BY timestamp;
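Once the table exists on each node, replication health can be inspected through ClickHouse's built-in system.replicas table, for example:

```sql
-- Check replication health for the table created above.
SELECT
    database,
    table,
    is_readonly,       -- 1 means the replica cannot accept writes
    absolute_delay,    -- how far this replica lags behind, in seconds
    queue_size         -- pending replication-queue entries
FROM system.replicas
WHERE table = 'metrics';
```

A non-zero is_readonly or a growing queue_size usually points at the coordination service (ZooKeeper/Keeper) rather than the table itself.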
Distributed Table
A Distributed table is a special table type that does not store data itself; it forwards queries to the underlying shard or replica tables.
Distributed(cluster_name, database, table [, sharding_key])
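For example, the replicated metrics table above could be exposed cluster-wide through a Distributed table. This is an illustrative sketch: the cluster name reuses the earlier configuration example, and rand() is just one possible sharding key.

```sql
-- Hypothetical cluster-wide view over the local 'metrics' tables.
-- AS metrics copies the column definitions from the local table.
CREATE TABLE metrics_all AS metrics
ENGINE = Distributed(cluster_3shards_3replicas, default, metrics, rand());
```

Queries against metrics_all fan out to every shard and merge the results on the initiating node.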
Sharding, Replica and Distributed Table Combination
- Sharding: lets the cluster scale horizontally
- Replica: gives the cluster high availability
- Query Strategy: queries usually go through a Distributed table; ClickHouse automatically selects one replica per shard to read from
Configuration File
<yandex>
    <remote_servers>
        <perftest_3shards_1replicas>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>h121.wzk.icu</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>clickhouse@wzk.icu</password>
                </replica>
            </shard>
        </perftest_3shards_1replicas>
    </remote_servers>
    <zookeeper-servers>
        <node index="1">
            <host>h121.wzk.icu</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <macros>
        <shard>01</shard>
        <replica>h121.wzk.icu</replica>
    </macros>
</yandex>
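After the server picks up this configuration, it can be verified from SQL; system.clusters and system.macros are built-in ClickHouse system tables:

```sql
-- List the shards and replicas ClickHouse sees for this cluster.
SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'perftest_3shards_1replicas';

-- Confirm the values substituted for {shard} and {replica}.
SELECT * FROM system.macros;
```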
Demo
Create Local Table
CREATE TABLE test_tiny_log(
    id UInt16,
    name String
) ENGINE = TinyLog;
Create Distributed Table
CREATE TABLE dis_table(
    id UInt16,
    name String
) ENGINE = Distributed(perftest_3shards_1replicas, default, test_tiny_log, id);
Insert Data
The local table is empty at this point, so first seed it with a few sample rows, then redistribute them through the Distributed table:
INSERT INTO test_tiny_log VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO dis_table SELECT * FROM test_tiny_log;
Query Data
SELECT count() FROM dis_table;
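To see how the rows landed across the cluster, a per-host breakdown can be obtained with ClickHouse's hostName() function:

```sql
-- Count rows per node; each shard contributes its local portion.
SELECT hostName() AS node, count() AS rows
FROM dis_table
GROUP BY node;
```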
Error Quick Reference
| Symptom/Error | Common Root Cause | Quick Fix |
|---|---|---|
| Table doesn’t exist on replica | ON CLUSTER not executed | Add CREATE … ON CLUSTER |
| ZooKeeper session expired | ZK/Keeper flapping | Troubleshoot network/increase timeout |
| Too many parts | Too many small parts | Increase batch size/adjust merge threshold |
| Not enough quorums to write | insert_quorum requirement too high | Lower insert_quorum |
| Readonly replica | Disk full/readonly | Clear disk/remove readonly |
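When diagnosing the replication errors above, pending operations and their most recent failures can be inspected via the system.replication_queue table:

```sql
-- Show stuck replication tasks and the last error for each.
SELECT database, table, type, num_tries, last_exception
FROM system.replication_queue
ORDER BY num_tries DESC
LIMIT 10;
```

Entries with a high num_tries and a non-empty last_exception usually identify the part or merge that is blocking the replica.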