TL;DR

  • Scenario: 2C4G/2C2G three-node mixed deployment, Druid 30.0.0, Kafka/HDFS/MySQL collaboration.
  • Conclusion: the cluster runs on low-spec hardware, but the crux is keeping DirectMemory and processing.buffer consistent and making the surrounding infrastructure (ZK/HDFS/MySQL) reachable.
  • Output: step-by-step configuration points, a version matrix, and a quick-reference table of common faults with troubleshooting steps.

Version Matrix

| Component / Config | Version / Parameter | Verified | Note |
| --- | --- | --- | --- |
| Apache Druid | 30.0.0 ($DRUID_HOME) | Yes | Process division: master (coordinator+overlord), data (historical+middleManager), query (broker+router); mixed deployment on h121/h122/h123. |
| Metadata storage | MySQL (connector 8.0.19) | Yes | Connector placed in extensions/mysql-metadata-storage; druid.metadata.storage.* points to h122. |
| Deep storage | HDFS (/druid/segments) | Yes | Relies on the default FS from core-site; production should use an absolute hdfs://host:port/ path. |
| Indexing logs | HDFS (/druid/indexing-logs) | Yes | Requires the Hadoop configs in _common to take effect. |
| ZooKeeper | h121,h122,h123:2181 | Yes | druid.zk.paths.base=/druid; ensure ACLs and network are reachable. |
| Kafka real-time ingestion | Version not recorded | Partial | MiddleManager can connect; stress-test before scaling up task slots. |
| JDK | Not recorded | Not verified | Druid 30 typically recommends Java 11/17; confirm against production. |
| Coordinator/Overlord JVM | -Xms/-Xmx=512m | Yes | Usable at low throughput; the management plane prioritizes stability. |
| Historical JVM | -Xms/-Xmx=512m; MaxDirectMemorySize=1g | Yes | Paired with the buffer=50,000,000 setting below. |
| MiddleManager JVM | -Xms/-Xmx=128m | Yes | Demo only; increase when task volume grows. |
| processing.buffer.sizeBytes | 50,000,000 | Yes | Must satisfy MaxDirectMemory ≥ buffer × (numMergeBuffers + numThreads + 1). |

Overall Introduction

Apache Druid is a high-performance, distributed columnar database built for real-time analysis and querying of large-scale datasets. It suits OLAP scenarios and excels at processing large-scale real-time data streams. Druid’s architecture consists of several parts: data ingestion, storage, querying and management.

For cluster configuration, Druid typically consists of:

  • Data Ingestion Layer: Uses MiddleManager nodes to handle real-time data ingestion from different data sources (like Kafka, HDFS).
  • Storage Layer: Data is stored on Historical nodes, which manage older data and support efficient queries. Data is stored in columnar format, optimizing query performance.
  • Query Layer: Broker nodes act as query routers, receiving user queries, distributing them to Historical or real-time nodes, then merging the results and returning them to the user.
  • Coordination Layer: Coordinator nodes manage cluster state and data allocation, ensuring even data distribution and automatic node failure handling.

Druid’s configuration files allow users to customize parameters like JVM settings, memory allocation and data sharding strategies for optimization based on different workloads and performance requirements. Additionally, Druid supports multiple query languages including SQL for flexible data analysis.


Cluster Planning

Cluster deployment distribution:

  • Master Node: Deploy Coordinator and Overlord processes
  • Data Node: Run Historical and MiddleManager processes
  • Query Node: Deploy Broker and Router processes

Actual Deployment:

| Node | Config | Deployed Services |
| --- | --- | --- |
| h121.wzk.icu | 2C4G | ZooKeeper, Kafka, Druid |
| h122.wzk.icu | 2C4G | ZooKeeper, Kafka, Druid, MySQL (installed back in the Hive setup) |
| h123.wzk.icu | 2C2G | ZooKeeper, Druid |

Environment Variables

vim /etc/profile

Write the following:

# druid
export DRUID_HOME=/opt/servers/apache-druid-30.0.0
export PATH=$PATH:$DRUID_HOME/bin

Configuration Files

Link Hadoop configuration files:

  • core-site.xml
  • hdfs-site.xml
  • yarn-site.xml
  • mapred-site.xml

Link the files above into conf/druid/cluster/_common.

Execute:

cd $DRUID_HOME/conf/druid/cluster/_common
ln -s $HADOOP_HOME/etc/hadoop/core-site.xml core-site.xml
ln -s $HADOOP_HOME/etc/hadoop/hdfs-site.xml hdfs-site.xml
ln -s $HADOOP_HOME/etc/hadoop/yarn-site.xml yarn-site.xml
ln -s $HADOOP_HOME/etc/hadoop/mapred-site.xml mapred-site.xml
ls
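A dangling symlink (for example when $HADOOP_HOME is unset or points to the wrong place) fails silently here and only surfaces later as HDFS errors, so it is worth confirming every link resolves. A small sketch, assuming the layout above:

```shell
# verify_links DIR — report whether each expected Hadoop config in DIR
# resolves to a real file; a MISS means the symlink target does not exist
verify_links() {
  local dir=$1 f
  for f in core-site.xml hdfs-site.xml yarn-site.xml mapred-site.xml; do
    if [ -e "$dir/$f" ]; then
      echo "OK   $f -> $(readlink -f "$dir/$f")"
    else
      echo "MISS $f"
    fi
  done
}
verify_links "$DRUID_HOME/conf/druid/cluster/_common"
```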

MySQL

Link MySQL driver to: $DRUID_HOME/extensions/mysql-metadata-storage

cd $DRUID_HOME/extensions/mysql-metadata-storage
cp $HIVE_HOME/lib/mysql-connector-java-8.0.19.jar mysql-connector-java-8.0.19.jar
ls
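The connector only helps if the metadata database itself exists: Druid creates its own tables on first start, but not the database. A hedged sketch for pre-creating it on h122, assuming the same hive account that druid.metadata.storage.connector.* is configured with in the next section:

```shell
# pre-create the metadata database on h122 (assumed credentials: the hive
# account reused from the Hive-era MySQL install); utf8mb4 avoids charset
# issues with datasource names
mysql -h h122.wzk.icu -u hive -p'hive@wzk.icu' <<'SQL'
CREATE DATABASE IF NOT EXISTS druid DEFAULT CHARACTER SET utf8mb4;
SHOW DATABASES LIKE 'druid';
SQL
```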

Modify Configuration

vim $DRUID_HOME/conf/druid/cluster/_common/common.runtime.properties

Modify the following:

# Add "mysql-metadata-storage"
druid.extensions.loadList=["mysql-metadata-storage", "druid-hdfs-storage", "druid-kafka-indexing-service", "druid-datasketches", "druid-multi-stage-query"]

# Write each machine's own IP or hostname
# Here is h121 node
druid.host=h121.wzk.icu

# Fill in zk address
druid.zk.service.host=h121.wzk.icu:2181,h122.wzk.icu:2181,h123.wzk.icu:2181
druid.zk.paths.base=/druid

# Comment out previous derby config
# Add mysql config
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://h122.wzk.icu:3306/druid
druid.metadata.storage.connector.user=hive
druid.metadata.storage.connector.password=hive@wzk.icu

# Comment out local config
# Add HDFS config, use HDFS as deep storage
druid.storage.type=hdfs
druid.storage.storageDirectory=/druid/segments

# Comment out indexer.logs local disk config
# Add indexer.logs for HDFS config
druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=/druid/indexing-logs
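HDFS-side preparation is easy to forget: the segment and indexing-log directories should exist and be writable by the user running Druid before the first task lands. A minimal sketch, matching the paths configured above:

```shell
# pre-create deep-storage and indexing-log paths on HDFS; run as a user
# with write access (paths mirror druid.storage.storageDirectory and
# druid.indexer.logs.directory above)
hdfs dfs -mkdir -p /druid/segments /druid/indexing-logs
hdfs dfs -chmod -R 775 /druid
hdfs dfs -ls /druid
```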

coordinator-overlord

Adjust the parameter sizes to your actual situation:

vim $DRUID_HOME/conf/druid/cluster/master/coordinator-overlord/jvm.config

Modify as follows:

-server
-Xms512m
-Xmx512m
-XX:+ExitOnOutOfMemoryError
-XX:+UseG1GC
-Duser.timezone=UTC+8
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

historical

vim $DRUID_HOME/conf/druid/cluster/data/historical/jvm.config

Modify as follows:

-server
-Xms512m
-Xmx512m
-XX:MaxDirectMemorySize=1g
-XX:+ExitOnOutOfMemoryError
-Duser.timezone=UTC+8
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

Also adjust one parameter:

vim $DRUID_HOME/conf/druid/cluster/data/historical/runtime.properties

Modify as follows:

# 50,000,000 bytes, i.e. roughly 50 MB (about 47.7 MiB)
druid.processing.buffer.sizeBytes=50000000

Note:

  • druid.processing.buffer.sizeBytes: size of the off-heap buffer each processing thread uses per query for aggregation (hash tables and intermediate results)
  • Requirement: maxDirectMemory ≥ druid.processing.buffer.sizeBytes × (druid.processing.numMergeBuffers + druid.processing.numThreads + 1)
  • If druid.processing.buffer.sizeBytes is raised, -XX:MaxDirectMemorySize must be raised to match, otherwise the Historical service fails to start
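The formula can be checked with quick shell arithmetic. On this 2-core box, assuming Druid's documented defaults of numThreads = cores − 1 = 1 and numMergeBuffers = max(2, numThreads / 4) = 2:

```shell
# required direct memory = buffer * (numMergeBuffers + numThreads + 1);
# values mirror the 2-core Historical node configured above
BUFFER=50000000   # druid.processing.buffer.sizeBytes
MERGE=2           # druid.processing.numMergeBuffers (assumed default)
THREADS=1         # druid.processing.numThreads (assumed default: cores - 1)
REQUIRED=$(( BUFFER * (MERGE + THREADS + 1) ))
LIMIT=$(( 1024 * 1024 * 1024 ))   # -XX:MaxDirectMemorySize=1g
echo "required=$REQUIRED limit=$LIMIT"
if [ "$REQUIRED" -le "$LIMIT" ]; then echo fits; else echo "raise MaxDirectMemorySize"; fi
```

With these values the requirement is 200,000,000 bytes against a 1g limit, so the 50 MB buffer fits comfortably.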

middleManager

vim $DRUID_HOME/conf/druid/cluster/data/middleManager/jvm.config

Configuration as follows (left at the defaults):

-server
-Xms128m
-Xmx128m
-XX:+ExitOnOutOfMemoryError
-Duser.timezone=UTC+8
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

Error Quick Reference

| Symptom | Root Cause | How to Locate | Fix |
| --- | --- | --- | --- |
| Historical exits on start / direct-buffer OOM | Buffer size does not match MaxDirectMemory | Check Historical logs for a "Cannot allocate memory" / "direct buffer" hint; compare runtime.properties with jvm.config | Lower druid.processing.buffer.sizeBytes or raise -XX:MaxDirectMemorySize so buffer × (merge + threads + 1) fits. |
| "No suitable driver" / MySQL driver not found | Connector in the wrong directory or misnamed | Check $DRUID_HOME/extensions/mysql-metadata-storage | Ensure mysql-connector-java-8.0.19.jar exists; restart to take effect. |
| Deep-storage write failure / "No FileSystem for scheme: hdfs" | Hadoop client dependencies/config not in effect | Check MiddleManager/Historical logs; verify the *-site.xml files under _common | Ensure druid-hdfs-storage is loaded; symlink the Hadoop configs and add Hadoop client dependencies if needed. |
| Broker query 500 / "No servers found" | No available Historical/real-time nodes, or segments not loaded | Check Segments/Rules in the Coordinator web console; Broker logs | Start Historical/real-time nodes; confirm segments are loaded and routing rules are valid. |
| ZK connection timeout / ConnectionLoss | Wrong ZK address/port or network unreachable | Test a direct connection with zkCli | Fix druid.zk.service.host; open port 2181; ensure /druid exists. |
| MiddleManager task stuck / logs not landing in HDFS | HDFS directory permission or quota issue | Inspect the directory with hdfs dfs -ls / -chmod; check task log errors | Grant permissions / pre-create the directory; adjust fs.permissions.umask-mode if needed. |
| Time column off by 8 hours | Non-standard JVM timezone parameter | Check timezone info in query-service and task logs | Set -Duser.timezone to Asia/Shanghai (or GMT+08:00); verify the ingestion spec's timestampSpec. |
| HDFS path resolution exception | Relative path used but defaultFS not configured | Check fs.defaultFS in core-site.xml | Change the storage directory to hdfs://host:port/… or configure defaultFS. |
| Hostname unreachable / DNS failure | Internal domain name not resolvable | Verify h121.wzk.icu etc. with ping/host | Add mappings to /etc/hosts or use IPs; update druid.host accordingly. |
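Several of the rows above come down to basic reachability, which can be probed even without zkCli. A small sketch using bash's /dev/tcp (hostnames are this cluster's; the 'ruok' four-letter word would additionally need whitelisting on ZooKeeper 3.5+, so this sticks to a plain TCP check):

```shell
# probe HOST PORT — report whether a TCP connection opens within 3 seconds
probe() {
  if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "reachable"
  else
    echo "UNREACHABLE"
  fi
}
for h in h121.wzk.icu h122.wzk.icu h123.wzk.icu; do
  echo "$h:2181 $(probe "$h" 2181)"
done
```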

【To be continued in Part 2】