Apache Hadoop 2 Cluster Cookbook

Environment and Prerequisites

  • Debian 11
  • OpenJDK 11
  • Hadoop 2.10.1

Deployment Layout

node1 runs the Hadoop NameNode and SecondaryNameNode.

node2 and node3 are DataNodes.

/etc/hosts on all machines:

172.16.11.6 node1
172.16.11.7 node2
172.16.11.8 node3
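
To verify that each name resolves to its LAN address (and not to 127.0.0.1; see the note under core-site.xml below), a quick check on every machine:

getent hosts node1 node2 node3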

Configure SSH Keys

On each of node1, node2, and node3, generate an SSH key and copy it to every node:

ssh-keygen -b 4096
ssh-copy-id dev@node1
ssh-copy-id dev@node2
ssh-copy-id dev@node3
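
The Hadoop start scripts reach the other nodes over SSH, so each machine should now log in to the others without a password. A quick test from node1 (assuming the dev user used above):

ssh dev@node2 hostname
ssh dev@node3 hostname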

Install JDK 11

Install JDK 11 on every machine:

apt-get update && apt-get upgrade
apt-get install openjdk-11-jre openjdk-11-jdk
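
Confirm the installation:

java -version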

Check where the JDK is installed:

update-alternatives --display java
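
Alternatively, resolving the real path of the java binary gives the same location; dropping the trailing /bin/java yields the JAVA_HOME used below:

readlink -f "$(which java)"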

Install Hadoop

Hadoop will be installed to /data/hadoop-2.10.1/ on every machine.

Download Hadoop (on node1):

mkdir /data/
cd /data
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.10.1/hadoop-2.10.1.tar.gz
tar xvf hadoop-2.10.1.tar.gz
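
Optionally verify the download against the checksum published on the Apache archive (the Tsinghua mirror may not carry the .sha512 file):

wget https://archive.apache.org/dist/hadoop/common/hadoop-2.10.1/hadoop-2.10.1.tar.gz.sha512
sha512sum -c hadoop-2.10.1.tar.gz.sha512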

Set the JAVA_HOME variable in Hadoop's startup environment:

vim /data/hadoop-2.10.1/etc/hadoop/hadoop-env.sh

Append:

export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
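
A quick way to confirm that Hadoop now finds the JVM:

/data/hadoop-2.10.1/bin/hadoop version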

Configure core-site.xml

vim /data/hadoop-2.10.1/etc/hadoop/core-site.xml

<configuration>
  <property>
    <!-- Deprecated alias of fs.defaultFS; still accepted by Hadoop 2.10 -->
    <name>fs.default.name</name>
    <value>hdfs://node1:9000</value>
  </property>
  <property>
    <!-- Deprecated alias of dfs.permissions.enabled; disables HDFS permission checks -->
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>

PS: In node1's /etc/hosts, node1 must resolve to its network IP (e.g. 172.16.11.6), not 127.0.0.1; otherwise the NameNode binds to the loopback interface and the DataNodes cannot reach it.

Configure hdfs-site.xml

vim /data/hadoop-2.10.1/etc/hadoop/hdfs-site.xml

<configuration>
  <property>
    <!-- Deprecated alias of dfs.datanode.data.dir: where DataNodes store HDFS blocks -->
    <name>dfs.data.dir</name>
    <value>/data/dfs/name/data</value>
    <final>true</final>
  </property>
  <property>
    <!-- Deprecated alias of dfs.namenode.name.dir: where the NameNode stores its metadata -->
    <name>dfs.name.dir</name>
    <value>/data/dfs/name</value>
    <final>true</final>
  </property>
  <property>
    <!-- Number of replicas per block -->
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Configure mapred-site.xml

vim /data/hadoop-2.10.1/etc/hadoop/mapred-site.xml

<configuration>
  <property>
    <!-- Hadoop 1.x JobTracker address; ignored when MapReduce runs on YARN -->
    <name>mapred.job.tracker</name>
    <value>node1:9001</value>
  </property>
</configuration>
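
mapred.job.tracker is a Hadoop 1.x setting; in Hadoop 2 the execution framework is chosen by mapreduce.framework.name, which defaults to local. To actually submit jobs to the YARN daemons started below, one would typically also add the following (an addition, not part of the original setup):

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>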

Pack the Hadoop program and its configuration into a tarball, then scp it to node2 and node3:

cd /data
tar czf hadoop-2.10.1.tar.gz hadoop-2.10.1
scp hadoop-2.10.1.tar.gz dev@node2:/data/
scp hadoop-2.10.1.tar.gz dev@node3:/data/

On node1

vim /data/hadoop-2.10.1/etc/hadoop/masters

Add:

node1

On node2 and node3

cd /data/
tar xvf hadoop-2.10.1.tar.gz
vim /data/hadoop-2.10.1/etc/hadoop/slaves

Add:

node2
node3

Format the NameNode

Log in to node1:

/data/hadoop-2.10.1/bin/hadoop namenode -format
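
hadoop namenode -format still works in 2.10 but prints a deprecation warning; the current equivalent is:

/data/hadoop-2.10.1/bin/hdfs namenode -format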

Start the Cluster

Log in to node1, node2, and node3:

/data/hadoop-2.10.1/sbin/start-all.sh
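
Then check the Java processes on each node:

jps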

node1 (the DataNode here likely appears because node1's own slaves file still contains the default localhost):

  • NameNode
  • SecondaryNameNode
  • NodeManager
  • ResourceManager
  • DataNode

node2 and node3:

  • DataNode
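
To confirm that the DataNodes have registered with the NameNode, run on node1:

/data/hadoop-2.10.1/bin/hdfs dfsadmin -report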
