TL;DR
- Scenario: Quickly try Apache Druid 30.0.0 on a single machine, verifying real-time and historical queries plus console access.
- Conclusion: Following the single-server quickstart configuration, the service starts smoothly; ports and memory/JVM settings are the most common pitfalls.
- Output: download/extraction steps and environment variables, a startup command reference, the console access URL, and a quick-reference card for common errors.
Version Matrix
| Target | Status | Note |
|---|---|---|
| Druid 30.0.0 download and extraction | Verified | Completed per the commands and screenshots in this article |
| Environment variables (DRUID_HOME/PATH) | Verified | Written to /etc/profile and refreshed |
| Single-machine nano-quickstart startup | Verified | bin/start-nano-quickstart, for lowest-spec verification |
| Console access on 8888 | Verified | Example address provided; mind port opening/security group/firewall |
| micro-quickstart (4C/16G) | Unconfirmed | Also recommended: increase jvm.config heap and processing threads |
| small/medium/large/xlarge | Unconfirmed | Evaluate JVM and processing/cache parameters against machine specs and data volume |
| ZooKeeper 2181 conflict handling | Verified | Avoided by stopping the occupying process; changing the port or using an external ZK also works |
| Batch/stream ingestion (Kafka/HDFS/S3) | Unconfirmed | Can be skipped for single-machine verification; configure for cluster/production |
System Architecture
Apache Druid is an open-source, distributed, high-performance real-time analytics database designed for fast aggregation and querying over large datasets. It is particularly well suited to time-series, event, and log data, and is widely used in fields such as internet advertising analytics, online transaction monitoring, and network security log analysis.
Druid’s core architecture uses modular design, consisting of several key components:
- Coordinator Node
  - Responsible for data segment lifecycle management
  - Monitors data node status
  - Executes data balancing and replication strategies
- Historical Node
  - Stores and queries immutable data segments
  - Uses memory-mapped files for efficient queries
  - Supports multiple compression formats (like LZ4, Zstandard)
- Broker Node
  - Receives client query requests
  - Routes queries to relevant nodes
  - Aggregates and returns final results
- Ingestion Node
  - Handles real-time data ingestion
  - Supports batch and streaming modes
  - Supports multiple data formats (JSON, CSV, etc.)
- Deep Storage
  - Persistent storage layer
  - Supports HDFS, S3, and other distributed file systems
  - Ensures high data availability
These components work together, enabling Druid to achieve sub-second query response times and support data ingestion of millions of events per second.
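To make the Deep Storage layer concrete: pointing Druid at S3 instead of the default local directory is a few lines in `conf/druid/single-server/<profile>/_common/common.runtime.properties`. The property names below are Druid's standard S3 settings; the bucket name is a placeholder, and credentials are deliberately left out (they can come from instance profiles or the `druid.s3.*` properties).

```properties
# Load the S3 extension and use it as the segment deep-storage backend.
druid.extensions.loadList=["druid-s3-extensions"]
druid.storage.type=s3
druid.storage.bucket=your-bucket
druid.storage.baseKey=druid/segments
```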
Core Components
Ingestion Layer
- Data Sources: Druid ingests from sources such as Kafka, HDFS, and Amazon S3, in either batch or streaming mode.
- Task Management: A task coordinator manages ingestion tasks, keeping the data flow smooth and highly available.
Storage Layer
- Segment: Druid divides data into multiple chunks called “Segments”. Each segment typically contains data within a time period, optimized for fast queries.
- Time Partitioning: Druid partitions data by time to improve query performance. Data is indexed by timestamp, facilitating efficient time range queries.
Query Layer
- Broker: Responsible for receiving user query requests and routing them to corresponding data nodes (Historical and Real-time nodes).
- Query Execution: Druid supports multiple query types including aggregation queries, filter queries and group-by queries.
Historical Node
- Stores and manages long-term data segments, responsible for processing queries on historical data.
Real-time Node
- Used for real-time data ingestion, real-time processing and generating queryable segments. Suitable for applications requiring low-latency data access.
Coordinator Node
- Responsible for managing Druid cluster nodes, monitoring node health, data distribution and load balancing.
Data Flow
- Data Ingestion: Data flows into Druid from external sources (such as a Kafka message queue) and is ingested after task management and transformation.
- Data Storage: Data is segmented and stored on Historical and Real-time nodes, partitioned by time and compressed to optimize storage.
- Query Processing: Users send queries through a query interface (SQL or Druid's native query language); the Broker distributes the request to the relevant data nodes, aggregates the partial results, and returns the final answer.
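The query-processing step above can be sketched as an HTTP call: Druid accepts SQL at the standard `/druid/v2/sql` endpoint, reachable through the Router on port 8888. The `wikipedia` datasource is the quickstart tutorial example and an assumption here; substitute your own.

```shell
# Build a Druid SQL payload; /druid/v2/sql is Druid's SQL-over-HTTP endpoint.
# "wikipedia" is the quickstart tutorial datasource -- substitute your own.
cat > /tmp/druid-query.json <<'EOF'
{"query": "SELECT channel, COUNT(*) AS edits FROM wikipedia GROUP BY channel ORDER BY edits DESC LIMIT 5"}
EOF
cat /tmp/druid-query.json
# With the service running, submit it via the Router:
# curl -s -H 'Content-Type: application/json' -d @/tmp/druid-query.json http://localhost:8888/druid/v2/sql
```

The Broker answers with a JSON array of result rows, which is convenient to pipe into `jq` or any HTTP client library.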
Query Optimization
- Columnar Storage: Druid uses columnar storage format, improving compression ratio and query performance.
- Indexing: Druid builds bitmap indexes on dimension columns, accelerating filter and aggregation operations.
- Pre-aggregation: Pre-computes commonly used aggregation operations to reduce real-time query computation burden.
Download and Extraction
wget https://dlcdn.apache.org/druid/30.0.0/apache-druid-30.0.0-bin.tar.gz
tar -zxvf apache-druid-30.0.0-bin.tar.gz
mv apache-druid-30.0.0 /opt/servers/
cd /opt/servers/apache-druid-30.0.0
ls
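Before extracting, it is worth verifying the tarball against its published SHA-512 (Apache mirrors place a `.sha512` file next to each release artifact). A small sketch, reusing the filenames from this article:

```shell
# Verify download integrity; filenames follow the Druid 30.0.0 download above.
TARBALL=apache-druid-30.0.0-bin.tar.gz
if [ -f "$TARBALL" ]; then
  # Compare the printed digest against the mirror's ${TARBALL}.sha512 file.
  sha512sum "$TARBALL"
else
  echo "download $TARBALL first"
fi
```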
Configuration Files
Configuration files for single-server deployment are located at:
conf/druid/single-server/
├── large
├── medium
├── micro-quickstart
├── nano-quickstart
├── small
└── xlarge
Startup Requirements
| Config | CPU | Memory | Startup Command | Config Directory |
|---|---|---|---|---|
| Nano-Quickstart | 1 | 4GB | bin/start-nano-quickstart | conf/druid/single-server/nano-quickstart/* |
| Micro Quickstart | 4 | 16GB | bin/start-micro-quickstart | conf/druid/single-server/micro-quickstart/* |
| Small | 8 | 64GB | bin/start-small | conf/druid/single-server/small/* |
| Medium | 16 | 128GB | bin/start-medium | conf/druid/single-server/medium/* |
| Large | 32 | 256GB | bin/start-large | conf/druid/single-server/large/* |
| X-Large | 64 | 512GB | bin/start-xlarge | conf/druid/single-server/xlarge/* |
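The thresholds in this table can be turned into a quick sanity check before picking a startup script. A hypothetical helper (the profile names and sizing come from the table above; the GB rounding is deliberately coarse):

```shell
# Hypothetical helper: map this machine's cores/RAM onto the profile table.
cores=$(nproc)
mem_gb=$(awk '/MemTotal/ {printf "%d", $2/1024/1024}' /proc/meminfo)
if   [ "$cores" -ge 64 ] && [ "$mem_gb" -ge 512 ]; then profile=xlarge
elif [ "$cores" -ge 32 ] && [ "$mem_gb" -ge 256 ]; then profile=large
elif [ "$cores" -ge 16 ] && [ "$mem_gb" -ge 128 ]; then profile=medium
elif [ "$cores" -ge 8 ]  && [ "$mem_gb" -ge 64 ];  then profile=small
elif [ "$cores" -ge 4 ]  && [ "$mem_gb" -ge 16 ];  then profile=micro-quickstart
else profile=nano-quickstart
fi
echo "suggested profile: $profile (${cores} cores, ${mem_gb}GB RAM)"
```

Note that a machine with nominally 16GB of RAM often reports slightly less in /proc/meminfo, so this sketch may suggest one profile lower than the nameplate specs.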
Environment Variable Configuration
vim /etc/profile
Write the following:
# druid
export DRUID_HOME=/opt/servers/apache-druid-30.0.0
export PATH=$PATH:$DRUID_HOME/bin
Refresh the environment variables, then stop any other service occupying Druid's ports (such as a standalone ZooKeeper on 2181):
source /etc/profile
zkServer.sh stop
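To see whether (and by what) port 2181 is actually held before stopping anything, a quick check (assumes `ss` from iproute2; the error table's `lsof -i :2181` is an equivalent check):

```shell
# Show any listener on ZooKeeper's default port 2181.
# Process names in the output may require root (ss -p / lsof alike).
ss -lntp 2>/dev/null | grep ':2181 ' || echo "port 2181 is free"
```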
Then start Druid service:
bin/start-nano-quickstart
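Once started, a small probe can confirm the stack is answering before opening the browser. Druid services expose `GET /status`; the Router's copy on 8888 is a convenient "is the stack up" check. This is a sketch, not part of Druid's own scripts; three quick attempts here, where a real wait loop would poll for longer:

```shell
# Hedged readiness probe against the Router's /status endpoint on port 8888.
up=no
for i in 1 2 3; do
  if curl -fsS --max-time 2 http://localhost:8888/status >/dev/null 2>&1; then
    up=yes
    break
  fi
  sleep 1
done
echo "router up: $up"
```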
View Page
Access the console in a browser (the hostname below is the author's; substitute your own):
http://h121.wzk.icu:8888/
Error Quick Reference
| Symptom | Root Cause | How to Diagnose | Fix |
|---|---|---|---|
| Cannot access 8888 | Port not opened, or listening on localhost only | ss -lntp / firewall rules; no exceptions in logs | Open 8888 or change the bind address; check reverse proxy and security group |
| OOM / excessive GC at startup | JVM heap too small, too many processing threads | OutOfMemoryError in var/sv/*/logs | Adjust -Xms/-Xmx in conf/**/jvm.config and the processing threads |
| Port 2181 occupied | Local ZK or another process already running | lsof -i :2181 | Stop the occupying process, or change the Druid/ZK port and restart |
| Console shows "No datasource found" | Ingestion not finished or segments not loaded | Coordinator/Historical logs and the Segments tab | Wait for the task to complete; check Deep Storage and Historical availability |
| Slow or timed-out queries | Time partition/filter miss, insufficient off-heap memory | Broker/Historical logs, query plan | Optimize time filters and dimension indexes; increase processing threads and memory buffers |
| Kafka stream ingestion fails | Topic unreachable or authentication error | Connection/auth errors in Indexing Service logs | Verify bootstrap.servers and SASL/SSL config; test connectivity independently with kcat |
| Time ranges appear shifted | Inconsistent timezone/timestamp parsing | Compare timezone/format between ingestion and query | Unify the ingestion timestampSpec and query timezone; add a transform if needed |
| Permission/executable errors | Missing executable bit or directory permissions | ls -l bin/, system logs | chmod +x bin/*; ensure the running user can read/write the install directory |
| "Config not found" / settings not taking effect | Environment not refreshed after changes, or wrong path | echo $DRUID_HOME / which druid | Re-source /etc/profile; verify the config directory matches the startup script |
| Segments unassigned / queries missing data | Historical offline or load unbalanced | Segment assignment status in the Coordinator UI | Start/scale Historicals; check storage reachability and replica balancing |
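For the first row of the table, a portable local probe (bash's `/dev/tcp`) helps separate "Druid is down" from "the firewall or bind address is blocking remote access":

```shell
# If this succeeds locally but remote browsers still fail, suspect the firewall,
# security group, or a localhost-only bind rather than Druid itself.
if timeout 2 bash -c 'exec 3<>/dev/tcp/127.0.0.1/8888' 2>/dev/null; then
  echo "8888 reachable locally"
else
  echo "8888 not reachable locally"
fi
```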
Summary
For large systems, the official recommendation is a clustered deployment, for fault tolerance and reduced resource contention. Single-machine deployment suits development, testing, and quick verification; production environments should run a cluster to ensure high availability and performance.