UP | HOME

配置Hadoop环境

Table of Contents

1 下载和并解压 Hadoop

可以直接在清华镜像源下载,解压到对应目录

wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/core/hadoop-2.7.7/hadoop-2.7.7.tar.gz
mkdir -p /usr/local/java
tar xzvf hadoop-2.7.7.tar.gz

2 配置好相关的环境变量

export HADOOP_HOME=${HADOOP_HOME:/usr/local/java/hadoop-2.7.7}
export HADOOP_DATA="$HADOOP_HOME/data"
export PATH="$HADOOP_HOME/bin:$PATH"
alias h0="$HADOOP_HOME/sbin/stop-all.sh"
alias h1="$HADOOP_HOME/sbin/start-all.sh"

3 配置 Hadoop

3.1 配置 core-site.xml

添加 HDFS 的端口为 9600

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9600</value>
  </property>
</configuration>

3.2 配置 hdfs-site.xml

添加 HDFS 的 NameNode 和 DataNode 的路径,并将备份副本数量设置为 1

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/java/hadoop-2.7.7/data/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/java/hadoop-2.7.7/data/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

4 配置 Yarn

单机上不一定要启动 Yarn 的调度,一般在集群上使用 Yarn 调度才有价值,单价启动 Yarn 可能还会让程序运行过慢

4.1 配置 mapred-site.xml

cp mapred-site.xml.template mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

4.2 配置 yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

5 格式化 HDFS 文件系统

hdfs namenode -format

6 启动关闭 Hadoop 相关服务

$HADOOP_HOME/sbin/startk-all.sh
$HADOOP_HOME/sbin/stop-all.sh

7 网页浏览工具

  1. NameNode 属性查看 http://localhost:50070
  2. ResourceManager 属性查看 http://localhost:8088

8 使用自带的示例程序测试

cd $HADOOP_HOME
hadoop fs -cp README.txt hdfs://localhost:9000/user/hujinghui
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount README.txt workcount002
➜  hadoop-2.7.7 hadoop fs -ls /user/hujinghui/workcount002/
20/09/19 14:54:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   1 hujinghui supergroup          0 2020-09-19 14:52 /user/hujinghui/workcount002/_SUCCESS
-rw-r--r--   1 hujinghui supergroup       1306 2020-09-19 14:52 /user/hujinghui/workcount002/part-r-00000
➜  hadoop-2.7.7 hadoop fs -cat /user/hujinghui/workcount002/part-r-00000
20/09/19 14:54:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
(BIS),  1
(ECCN)  1
(TSU)   1
(see    1

9 参考链接

Last Updated 2020-09-20 Sun 17:53. Created by Jinghui Hu at 2020-09-19 Sat 13:57.