Hadoop 3.2.2 Notes
2021-04-20
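Before installing
- Create a dedicated user and group for the installation. Never install as root, or you will run into many permission problems at startup.
- Set environment variables in /etc/profile and keep the user's .bashrc clean.
- Set up passwordless SSH login.
- Disable the system firewall.
- After installing and before the first start, run hdfs namenode -format to format the NameNode.
- When startup fails, check the logs, and delete the directories HDFS uses (data, tmp, and so on) before the next start.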
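The cleanup in the last point can be wrapped in a small helper. This is a minimal sketch, assuming tmp, name, and data all sit under one storage directory (the layout used in the configuration later in these notes); the function name reset_hdfs_storage is hypothetical. Re-formatting erases all HDFS metadata, so only do this on a cluster whose data you can afford to lose.

```shell
# Hypothetical helper: wipe the HDFS working directories before re-formatting.
# Assumes tmp/, name/ and data/ all live under a single storage root.
reset_hdfs_storage() {
  storage_root="$1"
  # Refuse to run with an empty or missing path so rm -rf cannot hit the wrong place.
  if [ -z "$storage_root" ] || [ ! -d "$storage_root" ]; then
    echo "storage root not found: '$storage_root'" >&2
    return 1
  fi
  for sub in tmp name data; do
    rm -rf "${storage_root:?}/$sub"
  done
  echo "cleared tmp, name and data under $storage_root"
}
```

A typical sequence would be to stop the daemons, run reset_hdfs_storage /home/hadoop/opt/hadoop-3.2.2/storage on each node, and then run hdfs namenode -format on the master before starting HDFS again.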
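Installation steps
- Download the required Hadoop release.
- Extract it into your chosen installation directory.
- Go into etc/hadoop and edit the XML configuration files.
- Copy the configured Hadoop directory to the worker nodes.
- Start HDFS.
- Start YARN.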
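The first two and last two steps above could look roughly like the functions below; configuration and distribution happen in between. This is only a sketch: the install directory matches the paths used in these notes, while the Apache archive URL and the function names are assumptions, so adjust them to wherever you fetch the release from.

```shell
# Sketch of the download/extract and start steps as shell functions.
# The archive URL is an assumption about where the release is fetched from.
HADOOP_VERSION="3.2.2"
INSTALL_DIR="/home/hadoop/opt"
HADOOP_TARBALL="hadoop-${HADOOP_VERSION}.tar.gz"
HADOOP_URL="https://archive.apache.org/dist/hadoop/common/hadoop-${HADOOP_VERSION}/${HADOOP_TARBALL}"

# Download the release and unpack it into the install directory.
fetch_and_unpack() {
  mkdir -p "$INSTALL_DIR" || return 1
  wget -O "/tmp/$HADOOP_TARBALL" "$HADOOP_URL" || return 1
  tar -xzf "/tmp/$HADOOP_TARBALL" -C "$INSTALL_DIR"
}

# Start HDFS first, then YARN, from the master node.
# start-dfs.sh and start-yarn.sh ship in the sbin directory of the release.
start_cluster() {
  "$INSTALL_DIR/hadoop-$HADOOP_VERSION/sbin/start-dfs.sh" || return 1
  "$INSTALL_DIR/hadoop-$HADOOP_VERSION/sbin/start-yarn.sh"
}
```

Run fetch_and_unpack as the dedicated Hadoop user, not as root, and call start_cluster only after the NameNode has been formatted.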
Configuration
/etc/profile
#set java environment
export JAVA_HOME=/home/hadoop/opt/amazon-corretto-8.282.08.1-linux-x64
export JRE_HOME=/home/hadoop/opt/amazon-corretto-8.282.08.1-linux-x64/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$JAVA_HOME:$PATH
#set hadoop environment
export HADOOP_HOME=/home/hadoop/opt/hadoop-3.2.2
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
#set maven
export MAVEN_HOME=/home/hadoop/opt/apache-maven-3.6.3
export PATH=$MAVEN_HOME/bin:$PATH
#required for reading HDFS (e.g. from TensorFlow), see:
#https://github.com/tensorflow/examples/blob/master/community/en/docs/deploy/hadoop.md
export HADOOP_HDFS_HOME=/home/hadoop/opt/hadoop-3.2.2
source ${HADOOP_HOME}/libexec/hadoop-config.sh
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${JAVA_HOME}/jre/lib/amd64/server
export CLASSPATH=$(${HADOOP_HOME}/bin/hadoop classpath --glob)
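After editing /etc/profile and running source /etc/profile, a quick sanity check helps catch typos in the paths. The function below is a hypothetical helper, not part of Hadoop; it only checks that the main variables are set and point at existing directories.

```shell
# Hypothetical check: report which of the exported variables are missing
# or point at directories that do not exist.
check_hadoop_env() {
  status=0
  for var in JAVA_HOME HADOOP_HOME HADOOP_HDFS_HOME; do
    # Indirect lookup of the variable named in $var.
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "$var is not set"
      status=1
    elif [ ! -d "$value" ]; then
      echo "$var points to a missing directory: $value"
      status=1
    else
      echo "$var ok: $value"
    fi
  done
  return $status
}
```

If all three lines report ok, running hadoop version is a reasonable next check that the PATH entries work too.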
core-site.xml configuration
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master-12ecb6f03:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/opt/hadoop-3.2.2/storage/tmp</value>
</property>
</configuration>
hadoop-env.sh: find the JAVA_HOME line and change it
export JAVA_HOME=/home/hadoop/opt/amazon-corretto-8.282.08.1-linux-x64
hdfs-site.xml configuration
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/opt/hadoop-3.2.2/storage/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/opt/hadoop-3.2.2/storage/data</value>
</property>
</configuration>
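To confirm that Hadoop actually picks up the core and HDFS settings above, you can ask it for the resolved values with hdfs getconf -confKey. The wrapper function below is a hypothetical convenience; only the hdfs getconf call itself comes from Hadoop.

```shell
# Print the values Hadoop resolves for the core and HDFS keys set above.
# Uses `hdfs getconf -confKey`; the function name is hypothetical.
show_hdfs_config() {
  for key in fs.defaultFS hadoop.tmp.dir dfs.replication \
             dfs.namenode.name.dir dfs.datanode.data.dir; do
    printf '%s = %s\n' "$key" "$(hdfs getconf -confKey "$key")"
  done
}
```

If a key prints a default instead of the value you configured, the file is probably misnamed or not in the etc/hadoop directory Hadoop is reading.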
yarn-site.xml configuration
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master-12ecb6f03</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
workers configuration
master-12ecb6f03
worker-19b369d40-1
worker-19b369d40-2
worker-19b369d40-3
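The workers file lists the hosts that run a DataNode and NodeManager; here the master is listed too, so it also stores data and runs tasks. The same file can drive the distribution step. The sketch below prints one rsync command per remote host instead of running anything, so the commands can be reviewed first; the hadoop login user, the use of rsync, and the function name are all assumptions.

```shell
# Print one rsync command per host in a workers file, skipping the local host.
# Commands are printed, not executed. The "hadoop" user and rsync are assumptions.
print_distribute_commands() {
  workers_file="$1"
  hadoop_dir="$2"
  local_host="$(hostname)"
  while IFS= read -r host || [ -n "$host" ]; do
    # Skip blank lines and the machine we are running on.
    [ -z "$host" ] && continue
    [ "$host" = "$local_host" ] && continue
    echo "rsync -a $hadoop_dir/ hadoop@$host:$hadoop_dir/"
  done < "$workers_file"
}
```

For example, print_distribute_commands "$HADOOP_HOME/etc/hadoop/workers" "$HADOOP_HOME" lists the commands; pipe the output to sh once it looks right.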
mapred-site.xml configuration
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
</configuration>
Testing
Submit a MapReduce job
hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.2.jar wordcount /input /output
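The command above assumes /input already holds some text files and that /output does not exist yet. A fuller smoke test might look like the sketch below, which uploads a sample file, clears any old output, runs the job, and prints the result. The function name and the sample file are hypothetical; the hdfs dfs subcommands are standard FileSystem shell commands.

```shell
# Hypothetical end-to-end smoke test around the wordcount example.
# /input and /output match the command above; the sample file is an assumption.
run_wordcount_smoke_test() {
  examples_jar="$1"
  sample_file="$2"
  hdfs dfs -mkdir -p /input || return 1
  hdfs dfs -put -f "$sample_file" /input/ || return 1
  # wordcount refuses to run if the output directory already exists.
  hdfs dfs -rm -r -f /output || return 1
  hadoop jar "$examples_jar" wordcount /input /output || return 1
  # Show the word counts written by the reducers.
  hdfs dfs -cat '/output/part-r-*'
}
```

For example: run_wordcount_smoke_test ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar ${HADOOP_HOME}/README.txt, using any small local text file you have as the second argument.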