This is article 71 in the Big Data series. It introduces the core architecture of a Spark cluster, compares deployment modes, and covers static and dynamic resource management strategies.
Core Architecture Components
A Spark cluster consists of three key roles: the Driver Program, the Cluster Manager, and Executors.
Driver Program
The Driver is the entry point and control center of a Spark application, responsible for three core tasks:
- SparkContext management: Create and maintain SparkContext, provide cluster connection, RDD operation interfaces and job scheduling capabilities
- Task scheduling: Convert user code to DAG (Directed Acyclic Graph) execution plan, then split into Stages and Tasks to distribute to Executors
- Execution monitoring: Track task success/failure status, handle failure retries and resource cleanup
Driver’s complete lifecycle: initialization → execution → result processing → resource cleanup.
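The DAG-to-Stage-to-Task conversion above can be sketched in plain Python. This is an illustrative simplification, not Spark's actual scheduler: the key idea is that a new Stage begins at every shuffle (wide-dependency) boundary, and each Stage fans out into one Task per partition.

```python
# Illustrative sketch (NOT Spark's real scheduler): how a Driver splits a
# linear DAG of operations into Stages at shuffle boundaries, then fans each
# Stage out into per-partition Tasks. All names here are hypothetical.

NARROW, WIDE = "narrow", "wide"  # wide dependency = shuffle boundary

def split_into_stages(ops):
    """Group consecutive narrow ops; a wide op closes the current stage."""
    stages, current = [], []
    for name, dep in ops:
        current.append(name)
        if dep == WIDE:          # shuffle boundary ends the stage
            stages.append(current)
            current = []
    if current:
        stages.append(current)
    return stages

def stage_tasks(stages, num_partitions):
    """One task per (stage, partition) pair, as the Driver would schedule."""
    return [(sid, p) for sid in range(len(stages)) for p in range(num_partitions)]

dag = [("map", NARROW), ("filter", NARROW), ("reduceByKey", WIDE), ("map", NARROW)]
stages = split_into_stages(dag)              # [['map','filter','reduceByKey'], ['map']]
tasks = stage_tasks(stages, num_partitions=4)
print(len(stages), len(tasks))               # 2 stages, 8 tasks
```

The shuffle after reduceByKey forces a stage break because its input partitions must be fully materialized before the next stage can read them.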
Cluster Manager
The Cluster Manager is responsible for resource allocation and task coordination across the entire cluster. Spark supports four mainstream resource managers:
| Manager | Use Case | Core Features |
|---|---|---|
| Standalone | Development/testing | Simple deployment, no extra dependencies |
| YARN | Enterprise big data platform | Deep Hadoop integration, supports multi-tenancy |
| Mesos | Mixed workload clusters | Fine-grained scheduling, strong horizontal scaling |
| Kubernetes | Cloud-native applications | Container orchestration, supports elastic scaling |
In production, clusters that coexist with Hadoop typically choose YARN, while cloud-native deployments favor Kubernetes.
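For quick reference, each manager is selected through the --master URL at submission time. These are the standard URL forms; host names and ports below are placeholders:

```shell
spark-submit --master spark://master-host:7077 ...       # Standalone
spark-submit --master yarn ...                           # YARN (cluster located via HADOOP_CONF_DIR)
spark-submit --master mesos://master-host:5050 ...       # Mesos
spark-submit --master k8s://https://api-server:6443 ...  # Kubernetes
```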
Executor
An Executor is a JVM process running on a Worker node, responsible for:
- Executing Tasks assigned by Driver
- Caching intermediate computation results in memory to accelerate iterative computation
- Writing final results back to HDFS or returning to Driver
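These responsibilities can be sketched as a toy executor in plain Python. This is a drastic simplification for illustration only (not Spark's real code): run tasks assigned by the Driver, keep computed partitions in an in-memory cache, and report status back.

```python
# Toy model of an Executor's duties (NOT Spark's real implementation):
# execute tasks, cache intermediate partitions in memory, report status.
# All class and method names here are hypothetical.

class MiniExecutor:
    def __init__(self):
        self.cache = {}            # partition_id -> cached result ("memory")

    def run_task(self, partition_id, data, fn):
        # Reuse a cached partition if an earlier task already computed it,
        # the way a cached RDD partition skips recomputation.
        if partition_id in self.cache:
            return self.cache[partition_id], "cache_hit"
        result = [fn(x) for x in data]
        self.cache[partition_id] = result
        return result, "computed"

ex = MiniExecutor()
r1, s1 = ex.run_task(0, [1, 2, 3], lambda x: x * 2)   # first run: computed
r2, s2 = ex.run_task(0, [1, 2, 3], lambda x: x * 2)   # second run: served from cache
print(r1, s1, s2)   # [2, 4, 6] computed cache_hit
```

The cache hit on the second call is exactly why in-memory caching accelerates iterative algorithms: repeated passes over the same partition skip recomputation.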
Deployment Modes
Local Mode
Runs on a single machine with no cluster required; suitable for local development and debugging:

```shell
spark-shell --master local[*]   # Use all CPU cores
spark-shell --master local[4]   # Use 4 threads
```
Cluster Mode (Client vs Cluster)
When submitting to a real cluster, there are two sub-modes:
- Client mode: the Driver runs on the client node that submits the job; logs are printed directly to the terminal, which suits interactive debugging
- Cluster mode: the Driver runs on a node inside the cluster; a client disconnect does not affect job execution, which suits production
```shell
# Client mode submission
spark-submit --master yarn --deploy-mode client --class icu.wzk.App app.jar

# Cluster mode submission
spark-submit --master yarn --deploy-mode cluster --class icu.wzk.App app.jar
```
Cluster Startup Process
Start Hadoop Cluster
```shell
start-all.sh
```
Start Spark Standalone Cluster
```shell
cd /opt/servers/spark-2.4.5/sbin
./start-all.sh
```
After startup, access http://<master-ip>:8080 to view Spark Master Web UI.
Verify Cluster Status
```shell
# Run the built-in Pi example to verify the cluster is working
run-example SparkPi 10
```
Resource Management Strategies
Static Resource Allocation
Pre-specify fixed resources in a config file or on the submission command line:

```shell
spark-submit \
  --executor-memory 4g \
  --executor-cores 2 \
  --num-executors 10 \
  --class icu.wzk.App app.jar
```
Suitable for resource-exclusive, stable load batch processing scenarios.
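Before submitting, it is worth sanity-checking the cluster footprint such a command requests. On YARN, each Executor also carries memory overhead on top of --executor-memory (by default max(384 MB, 10% of executor memory), per spark.executor.memoryOverhead):

```python
# Back-of-envelope totals for the static submission above.
# The max(384 MB, 10%) rule is spark.executor.memoryOverhead's YARN default.
num_executors = 10
executor_memory_gb = 4
executor_cores = 2

overhead_gb = max(0.384, 0.10 * executor_memory_gb)   # per-executor overhead
total_memory_gb = num_executors * (executor_memory_gb + overhead_gb)
total_cores = num_executors * executor_cores

print(total_cores, round(total_memory_gb, 1))   # 20 cores, 44.0 GB
```

If the queue cannot grant the full 20 cores / ~44 GB, YARN will hold part of the request pending, so checking totals against queue capacity avoids surprise under-allocation.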
Dynamic Resource Allocation
Automatically scales the Executor count with the actual workload. Enable it in spark-defaults.conf; note that on YARN and Standalone this also requires the external shuffle service, so that shuffle files survive executor removal:

```properties
spark.shuffle.service.enabled=true
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.minExecutors=2
spark.dynamicAllocation.maxExecutors=20
spark.dynamicAllocation.executorIdleTimeout=60s
```
Suitable for resource-sharing scenarios with fluctuating load, such as interactive queries or stream processing, where it avoids wasting idle resources.
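The scaling behavior can be illustrated with a toy policy function. This is a hypothetical simplification of Spark's actual heuristics (which ramp requests up exponentially based on the task backlog), but it captures the clamping to min/max and the idle-timeout release configured above:

```python
# Toy model of dynamic allocation (a simplification of Spark's real policy):
# scale toward the pending-task backlog, clamp to [min, max], and shrink
# after executors sit idle past the timeout. Function name is hypothetical.

MIN_EXECUTORS, MAX_EXECUTORS = 2, 20   # mirrors the config above
IDLE_TIMEOUT_S = 60                    # mirrors executorIdleTimeout=60s

def target_executors(pending_tasks, idle_seconds, current):
    if pending_tasks > 0:
        # Backlog exists: scale up toward it (real Spark ramps up exponentially).
        desired = max(current, pending_tasks)
    elif idle_seconds >= IDLE_TIMEOUT_S:
        # No backlog and idle too long: shrink toward the floor.
        desired = MIN_EXECUTORS
    else:
        desired = current
    return max(MIN_EXECUTORS, min(MAX_EXECUTORS, desired))

print(target_executors(pending_tasks=35, idle_seconds=0, current=5))    # 20 (capped at max)
print(target_executors(pending_tasks=0, idle_seconds=120, current=10))  # 2  (idle release)
print(target_executors(pending_tasks=0, idle_seconds=10, current=10))   # 10 (no change)
```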
Monitoring and Tuning
- Spark UI (port 4040): View Job, Stage, Task execution status and data skew
- History Server: Persist completed job execution logs for post-analysis
- Ganglia/Prometheus: Cluster-level CPU, memory, network monitoring
Parallelism (spark.default.parallelism) and serialization (Kryo vs. Java) are two common tuning entry points.
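For parallelism, a common rule of thumb (a heuristic, not a hard requirement) is roughly 2-3 tasks per available core, so that stragglers do not leave cores idle:

```python
# Heuristic sizing for spark.default.parallelism: ~2-3 tasks per core.
# A rule of thumb, not a hard rule; function name is for illustration.
def suggested_parallelism(num_executors, cores_per_executor, tasks_per_core=2):
    return num_executors * cores_per_executor * tasks_per_core

# For the 10-executor, 2-core static allocation shown earlier:
print(suggested_parallelism(10, 2))      # 40
print(suggested_parallelism(10, 2, 3))   # 60
```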