This is article 3 in the Big Data series, covering SSH passwordless login configuration for a Hadoop cluster and a file distribution script, in preparation for cluster startup.
Complete illustrated version: CSDN Original | Juejin
Why Passwordless Login is Needed
When starting a Hadoop cluster, the master node connects to each slave node via SSH to execute startup commands. Without passwordless login, you would have to enter the password manually for every connection, and the cluster could not start automatically.
/etc/hosts Configuration (Important)
If this is not configured correctly, the nodes cannot resolve and reach one another. Every node needs mappings for all three hosts in /etc/hosts:
# Each machine's /etc/hosts needs these three lines
<h121's IP> h121.wzk.icu h121
<h122's IP> h122.wzk.icu h122
<h123's IP> h123.wzk.icu h123
Note for public cloud servers: Use public IP for hostname, not 127.x.x.x.
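The mappings can be sanity-checked with a short script. The IP addresses below are placeholders, and h123 is deliberately left out of the sample file to show what a missing entry looks like:

```shell
# Check that a hosts file lists every expected hostname.
# Placeholder IPs; h123 is intentionally missing from the sample.
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
192.168.1.121 h121.wzk.icu h121
192.168.1.122 h122.wzk.icu h122
EOF
missing=""
for h in h121.wzk.icu h122.wzk.icu h123.wzk.icu; do
    grep -qw "$h" "$hosts_file" || missing="$missing $h"
done
echo "missing:$missing"    # → missing: h123.wzk.icu
rm -f "$hosts_file"
```

On a real node, point the loop at /etc/hosts instead of the sample file.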
SSH Passwordless Configuration Steps
1. Generate RSA Keys on All Three Nodes
ssh-keygen -t rsa -b 4096
# Press Enter through all prompts; leave the passphrase empty
After generation, ~/.ssh/ contains:
id_rsa: private key
id_rsa.pub: public key
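If you want to skip the interactive prompts entirely, for example when scripting the setup, ssh-keygen can run non-interactively. The temporary directory here just keeps the example from touching your real keys:

```shell
# Non-interactive key generation (sketch): -N "" sets an empty passphrase,
# -f chooses the output path, -q suppresses the banner.
tmpdir=$(mktemp -d)
ssh-keygen -t rsa -b 4096 -N "" -f "$tmpdir/id_rsa" -q
ls "$tmpdir"    # id_rsa  id_rsa.pub
```

For the real setup, drop the -f flag so the key lands in ~/.ssh/id_rsa.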
2. Distribute Public Key to All Nodes
On each node, run the following to distribute that node's public key to all three machines (including itself):
ssh-copy-id h121.wzk.icu
ssh-copy-id h122.wzk.icu
ssh-copy-id h123.wzk.icu
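Under the hood, ssh-copy-id appends your public key to the remote ~/.ssh/authorized_keys and ensures the permissions are correct. The sketch below simulates that remote-side effect in a scratch directory; the key material is a placeholder, not a real key:

```shell
# Simulate the remote side of ssh-copy-id in a scratch directory
fake_home=$(mktemp -d)
pubkey="ssh-rsa AAAA...placeholder user@h121"   # placeholder key material
mkdir -p "$fake_home/.ssh"
chmod 700 "$fake_home/.ssh"
echo "$pubkey" >> "$fake_home/.ssh/authorized_keys"
chmod 600 "$fake_home/.ssh/authorized_keys"
stat -c '%a' "$fake_home/.ssh" "$fake_home/.ssh/authorized_keys"   # 700 / 600
```

Knowing this manual equivalent helps when ssh-copy-id is unavailable on a minimal image.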
3. Verify Passwordless Login
ssh h122.wzk.icu
# Success if no password required
Cluster File Distribution Script
In day-to-day cluster operations you often need to synchronize configuration files to every node. The rsync-based script below, placed in /usr/local/bin, makes this available globally:
#!/bin/bash
# Script: xsync, usage: xsync <file or directory>
# Put in /usr/local/bin/xsync and chmod +x
pcount=$#
if ((pcount == 0)); then
    echo "Usage: xsync <file or directory>"
    exit 1
fi
p1=$1
fname=$(basename "$p1")
pdir=$(cd -P "$(dirname "$p1")"; pwd)
echo "------ sync $pdir/$fname ------"
for host in h121.wzk.icu h122.wzk.icu h123.wzk.icu; do
    echo "-------- $host --------"
    rsync -rvl "$pdir/$fname" "$host:$pdir"
done
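Before installing, the basename/dirname resolution the script relies on can be sanity-checked locally. The temporary directories below stand in for the real Hadoop config path:

```shell
# Sanity-check xsync's path resolution: cd -P collapses "etc/../etc"
# into a clean absolute path, exactly as the script does.
tmp=$(mktemp -d)
tmp=$(cd -P "$tmp"; pwd)             # normalize any symlinks in the temp path
mkdir -p "$tmp/etc/hadoop"
p1="$tmp/etc/../etc/hadoop"          # a messy input path
fname=$(basename "$p1")
pdir=$(cd -P "$(dirname "$p1")"; pwd)
echo "$pdir/$fname"                  # prints "$tmp/etc/hadoop"
```

This is why xsync can be called with relative paths from any working directory.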
Install:
sudo cp xsync /usr/local/bin/
sudo chmod +x /usr/local/bin/xsync
Usage:
xsync /opt/servers/hadoop-2.9.2/etc/hadoop/
Common Pitfalls
- Permission issues: the ~/.ssh directory must be 700, and ~/.ssh/authorized_keys must be 600
- Firewall: public cloud servers need port 22 allowed (usually already open)
- First login prompt: the first SSH connection asks "Are you sure you want to continue connecting?"; enter yes once before passwordless login works
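If passwordless login still falls back to a password prompt, overly loose permissions are the usual cause, because sshd silently ignores authorized_keys it considers unsafe. A minimal repair, run on the target node:

```shell
# Reset the permissions sshd requires before it honors authorized_keys
mkdir -p ~/.ssh
chmod 700 ~/.ssh
touch ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
stat -c '%a' ~/.ssh ~/.ssh/authorized_keys    # 700 / 600
```

Check /var/log/secure (or the distribution's auth log) if logins still fail after this.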
Next article: Big Data 04 - Hadoop Cluster Startup