Big Data 136 - ClickHouse Table Engines: TinyLog/Log/StripeLog/Memory/Merge Selection Guide
TL;DR
- Scenario: You need to choose among small tables, temporary tables, log landing, and multi-table combined reads, where reaching for MergeTree is often overkill.
- Conclusion: With the selection table, MRE, and concurrency/file verification scripts below, you can run through the core differences of all 5 engines in about 10 minutes.
- Output: The full test walkthrough for every engine, including the SQL and file-level checks.
Version Matrix
| Item | Value | Note |
|---|---|---|
| Verified | ClickHouse Server 23.x/24.x | Single machine + local disk |
| Client | clickhouse-client | `--multiquery` demo |
90-Second Principles
- TinyLog: One compressed .bin file per column, append-only writes; essentially no concurrency handling, so reads are restricted while a write is in progress.
- Log: Stores marks (offsets) alongside the column files, enabling multi-threaded reads; a write on a table blocks other writes and reads on that table.
- StripeLog: Writes all columns into a single data.bin with marks kept centrally; fewer file descriptors, more predictable read parallelism.
- Memory: Pure in-memory storage, lost on restart; suited to small datasets, testing, and high-speed operator pipelines.
- Merge: A read-only "aggregation view" that stores no data; queries are pushed down to the underlying tables' engines.
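A quick way to compare the five engines side by side is to create one table per engine with the same schema. This is a minimal sketch; the table names are illustrative:

```sql
-- One table per engine, identical schema, for side-by-side testing.
CREATE TABLE demo_tinylog (id UInt16, msg String) ENGINE = TinyLog;
CREATE TABLE demo_log     (id UInt16, msg String) ENGINE = Log;
CREATE TABLE demo_stripe  (id UInt16, msg String) ENGINE = StripeLog;
CREATE TABLE demo_memory  (id UInt16, msg String) ENGINE = Memory;
-- Merge stores nothing itself; it reads from every table matching the regex
-- (a Merge table never selects from itself):
CREATE TABLE demo_all (id UInt16, msg String)
    ENGINE = Merge(currentDatabase(), '^demo_');
```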
Selection Decision
| Scenario | Why Recommended | What to Avoid |
|---|---|---|
| Small table, written once; config/dimension data (≤1M rows per table) | TinyLog is the lightest option; files are easy to inspect, reads and writes are simple | MergeTree (needlessly complex here) |
| Temporary logs, batch import then read-only, parallel reads wanted | Log keeps marks and supports concurrent reads; each INSERT forms a block | TinyLog (reads locked out, poor concurrency) |
| Many small tables (thousands), more stable read parallelism wanted | StripeLog uses a single data file plus marks; fewer file handles, more natural read parallelism | Log (more file fragmentation) |
| Very low-latency small datasets, pure in-memory operators | Memory engine delivers multi-GB/s scan throughput; good for testing and operator assembly | Anything requiring persistence |
| Unified "virtual view" reads across multiple tables | Merge aggregates tables by regex; parallel reads fall through to the underlying engines | Workloads needing writes or indexes |
Brief Introduction
A table engine (table type) is the core component that determines how a database system stores and processes data. It directly affects performance, feature set, and applicable scenarios. Specifically, the table engine determines several key aspects:
- Data Storage Mechanism:
- Physical storage format: Row storage (like InnoDB), column storage (like ClickHouse’s MergeTree) or memory storage (like Memory engine)
- Storage location: Can be stored on disk, SSD, memory or distributed file systems
- Read/write path: Some engines first write to memory buffer then persist to disk
- Query Support Capabilities:
- Supported query types: Full-text search, aggregation queries, time-series data processing, etc.
- Query optimization methods: Different engines use different query execution plans
- Special function support: Like geospatial data processing, JSON document processing
- Concurrency Control:
- Locking mechanisms: Row-level locking, table locking or optimistic concurrency control
- Transaction isolation levels: Read committed, repeatable read, etc.
- MVCC implementation
- Index Strategy:
- Index types: B-tree, LSM tree, inverted index, etc.
- Index maintenance: Some engines support automatic maintenance, some require manual
- Index selectivity: Different engines may have significantly different index usage efficiency
- Parallel Processing Capabilities:
- Multi-threaded query execution: Some engines can decompose single query into multiple parallel tasks
- Distributed query processing: Execute queries in parallel across nodes in cluster environment
- Resource isolation: Control concurrent query resource consumption
- Data High Availability:
- Replication mechanisms: Like primary-replica replication, multi-primary replication
- Fault recovery: Automatic fault transfer and recovery capabilities
- Data consistency guarantees: Strong consistency, eventual consistency, etc.
- Special Functions:
- Data TTL (Time To Live)
- Data compression algorithms
- Encrypted storage support
- Materialized view support
ClickHouse is a column-oriented database management system supporting multiple table engines, each with specific functions and uses.
MergeTree Series
- MergeTree: Most commonly used table engine, supports efficient partitioning, sorting, indexing, suitable for high-volume write and query scenarios. Supports primary key and indexes.
- ReplicatedMergeTree: Based on MergeTree with added replication, suitable for distributed cluster environments.
- ReplacingMergeTree: Allows newest records to overwrite old records, very suitable for scenarios needing deduplication by specific columns.
- SummingMergeTree: Supports aggregation of numeric columns, suitable for scenarios needing aggregation calculations.
- AggregatingMergeTree: Supports more complex aggregation operations, suitable for scenarios needing pre-computed summaries.
- CollapsingMergeTree: Designed for frequently changing data; during merges it collapses pairs of rows written with opposite Sign values (+1/-1) to reduce storage.
- VersionedCollapsingMergeTree: Adds version number on top of CollapsingMergeTree for better control of data merging.
Log Series
- Log: Simple table engine, doesn’t support indexes or partitioning, suitable for small data volume or log-style storage scenarios.
- TinyLog: The simplest engine; suitable for embedded scenarios or testing, not meant for large-scale data.
- StripeLog: Stores all columns of a table in a single data file plus a marks file; suited to many small tables and block-wise read/write patterns.
Special Engines
- Memory: Data lives only in memory; suited to fast read/write scenarios with no persistence requirement.
- Distributed: Used in distributed clusters, dispatches queries to multiple nodes, suitable for large-scale data and high-concurrency query scenarios.
- Merge: Treats multiple tables as a virtual table for queries, suitable for scenarios needing to union multiple tables for reading.
- Join: Pre-loads and stores Join tables, used to improve join operation efficiency.
View Series
- MaterializedView: Materialized view, allows pre-computation to speed up queries.
- View: Regular view, doesn’t store data, just query definition.
- Buffer: Temporarily stores data in memory, periodically batch writes to base table, suitable for scenarios needing write performance optimization.
- Null: Discards data when written, suitable for testing scenarios.
Log Series Details
TinyLog
Storage Structure and Mechanism
This table engine uses the most basic column-oriented storage, each column data stored as a separate compressed file (like .bin file), uses lightweight compression algorithms (like LZ4 or ZSTD) to reduce disk space. Write operations use append-only mode, all new data is added to the end of corresponding column files without modifying existing data.
Concurrency Limitations
- Read Concurrency:
- While a write is in progress, any concurrent read throws a "Table is locked for writing" exception
- Example: if a background task is importing data, user queries fail outright
- Write Concurrency:
- Multiple simultaneous writes can corrupt the data files
- Typical symptoms: incomplete column data, inconsistent row counts, garbled values
Applicable Scenario Analysis
- Best Use Pattern:
- One-time writes (Write-Once-Read-Many)
- Suitable for config tables, dimension tables that don’t change often
- Typical applications: Storing product categories, region codes and other basic data
- Capacity Suggestions:
- Official recommended upper limit: 1,000,000 rows
- Actual performance inflection point: About 50-100MB single table data volume
- Small table advantage: Single query usually only needs to open 2-3 column files
- Special Scenario Advantages:
- Excellent performance under small table cluster (1,000+ small tables) scenarios
- Can reduce file descriptor usage by over 50% compared to other engines
- Ideal choice for cold data archival storage
Feature Limitations
- Index Support:
- No indexes of any kind are supported
- Every query is a full table scan
- Workaround: consider materialized views for high-frequency query fields
- Performance Characteristics:
- Write speed: About 10-50MB/s (depends on hardware)
- Read latency: Linearly related to data volume
- Memory usage: Only needs most basic buffer (usually <10MB)
- Maintenance Notes:
- Doesn’t support ALTER TABLE to modify table structure
- Data deletion requires table rebuild
- No automatic compression organization mechanism
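The materialized-view workaround mentioned above can be sketched as follows. Table and column names are illustrative; the view is backed by MergeTree so that its ORDER BY key can prune data on lookups:

```sql
-- TinyLog base table: no indexes, every query scans the whole table.
CREATE TABLE region_codes (code UInt16, name String) ENGINE = TinyLog;

-- Materialized view on an indexed engine; new INSERTs into region_codes
-- are also written here, where the ORDER BY key serves point lookups.
CREATE MATERIALIZED VIEW region_codes_mv
ENGINE = MergeTree ORDER BY code
AS SELECT code, name FROM region_codes;

INSERT INTO region_codes VALUES (1, 'US'), (86, 'CN');
SELECT name FROM region_codes_mv WHERE code = 86;
```

Note that the view only captures rows inserted after it is created; pre-existing data would need a one-off backfill with INSERT ... SELECT.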
Test
Create a table with the TinyLog engine and insert one row:
CREATE TABLE t (a UInt16, b String) ENGINE = TinyLog;
INSERT INTO t (a, b) VALUES (1, 'abc');
Inspect the directory where the data is stored:
cd /var/lib/clickhouse/data/default/t
ls
File list explanation:
- a.bin and b.bin are compressed data for corresponding columns
- sizes.json records each bin file’s size
Log
Unlike TinyLog, Log stores marks in a small file alongside the column files. A mark is written for every data block and contains offsets that indicate where to start reading in order to skip a given number of rows. This makes multi-threaded reads possible: read operations can run concurrently, while a write blocks both reads and other writes. The Log engine does not support indexes. As with TinyLog, a failed write corrupts the table, and subsequent reads return an error. Log is suitable for temporary data, write-once tables, and testing or demonstration purposes.
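A minimal Log table can be created the same way as the TinyLog example above. On disk it adds a marks file next to the per-column files (exact file names can vary across ClickHouse versions):

```sql
CREATE TABLE log_demo (a UInt16, b String) ENGINE = Log;
INSERT INTO log_demo VALUES (1, 'abc');
-- Expected on-disk layout (path and file names may differ by version):
--   /var/lib/clickhouse/data/default/log_demo/
--     a.bin, b.bin    -- per-column compressed data, as in TinyLog
--     __marks.mrk     -- per-block offsets that enable parallel reads
--     sizes.json
```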
StripeLog
This engine belongs to the log engine family. Use it when you need to write many tables with small data volumes (less than 1 million rows each).
Write Data
The StripeLog engine stores all columns in a single file. For each INSERT request, ClickHouse appends a data block to the end of the table file, writing column by column.
ClickHouse writes following files for each table:
- data.bin data file
- index.mrk marks file; the marks hold the offset of each column within each inserted data block
The StripeLog engine does not support ALTER UPDATE or ALTER DELETE operations.
Read Data
Thanks to the marks file, ClickHouse can read the data in parallel. This means a SELECT returns rows in an unpredictable order; use an ORDER BY clause if you need sorted results.
Create Table
CREATE TABLE stripe_log_table (
timestamp DateTime,
message_type String,
message String
) ENGINE = StripeLog;
Insert Data
INSERT INTO stripe_log_table VALUES (now(), 'REGULAR', 'The first regular message');
INSERT INTO stripe_log_table VALUES
(now(), 'REGULAR', 'The second regular message'),
(now(), 'WARNING', 'The first warning message');
We used two INSERT requests, which creates two data blocks in the data.bin file.
Query Data
ClickHouse uses multiple threads when querying the data. Each thread reads a separate data block and returns its result rows independently as it finishes. As a result, in most cases the order of blocks in the output does not match the order of the same blocks in the input. For example:
SELECT * FROM stripe_log_table;
-- Sort the results (ascending by default)
SELECT * FROM stripe_log_table ORDER BY timestamp;
Memory
Memory engine: data is stored in memory in uncompressed, raw form and is lost on server restart. Reads and writes do not block each other, and indexes are not supported. Performance on simple queries is very high (over 10 GB/s scanned). It sees little production use beyond testing, or scenarios that need very high performance on datasets that stay small (upper limit around 100 million rows).
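A minimal sketch of the Memory engine in use; the table name is illustrative:

```sql
CREATE TABLE mem_demo (id UInt32, v Float64) ENGINE = Memory;
-- One million rows generated on the fly; everything stays in RAM,
-- nothing is written to disk.
INSERT INTO mem_demo SELECT number, number / 7 FROM numbers(1000000);
SELECT count(), max(v) FROM mem_demo;
-- DROP TABLE mem_demo;  -- or restart the server: the data is gone either way
```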
Merge
The Merge engine (not to be confused with MergeTree) stores no data itself, but can read from any number of other tables simultaneously. Reads are automatically parallelized; writes are not supported. When reading, the engine of each underlying table that actually holds data is used.
Merge parameters:
- Database name
- Table name regex pattern
Create Table
CREATE TABLE t1 (id UInt16, name String) ENGINE = TinyLog;
CREATE TABLE t2 (id UInt16, name String) ENGINE = TinyLog;
CREATE TABLE t3 (id UInt16, name String) ENGINE = TinyLog;
Insert Data
INSERT INTO t1 (id, name) VALUES (1, 'first');
INSERT INTO t2 (id, name) VALUES (2, 'second');
INSERT INTO t3 (id, name) VALUES (3, 'i am in t3');
Create the Merge Table
CREATE TABLE t (id UInt16, name String) ENGINE = Merge(currentDatabase(), '^t');
The regex '^t' matches t1, t2, and t3; a Merge table never selects from itself, so t is excluded automatically.
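Querying the Merge table reads from every matched underlying table in parallel. The virtual column _table reports which physical table each row came from:

```sql
SELECT * FROM t ORDER BY id;
-- 1  first
-- 2  second
-- 3  i am in t3

-- _table is a virtual column provided by the Merge engine:
SELECT _table, id, name FROM t ORDER BY id;
```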
Common Issues
| Symptom/Error | Likely Root Cause | Quick Fix |
|---|---|---|
| Table is locked for writing | TinyLog read hit an in-progress INSERT | SHOW PROCESSLIST; switch to Log/StripeLog or avoid concurrent access |
| SELECT rows come back out of order | Multi-threaded reads give no ordering guarantee | EXPLAIN PIPELINE; add an explicit ORDER BY |
| Merge finds no data | Regex does not match / wrong database name | SELECT name FROM system.tables; fix Merge(db, 'regex') |
| On-disk files do not match the docs | Version differences (.mrk/.mrk3) | Check system.build_options; adjust docs/verification script per version |
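When a Merge table comes back empty, a quick sanity check is to list the candidate tables and their engines, then compare the names against the regex passed to Merge(). The LIKE filter below assumes the t1/t2/t3 naming from the example above:

```sql
SELECT name, engine
FROM system.tables
WHERE database = currentDatabase() AND name LIKE 't%'
ORDER BY name;
```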