配置Hadoop环境
Table of Contents
1 下载和并解压 Hadoop
可以直接在清华镜像源下载,解压到对应目录
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/core/hadoop-2.7.7/hadoop-2.7.7.tar.gz mkdir -p /usr/local/java tar xzvf hadoop-2.7.7.tar.gz
2 配置好相关的环境变量
export HADOOP_HOME=${HADOOP_HOME:/usr/local/java/hadoop-2.7.7} export HADOOP_DATA="$HADOOP_HOME/data" export PATH="$HADOOP_HOME/bin:$PATH" alias h0="$HADOOP_HOME/sbin/stop-all.sh" alias h1="$HADOOP_HOME/sbin/start-all.sh"
3 配置 Hadoop
3.1 配置 core-site.xml
添加 HDFS 的端口为 9600
<configuration> <property> <name>fs.defaultFS</name> <value>hdfs://localhost:9600</value> </property> </configuration>
3.2 配置 hdfs-site.xml
添加 HDFS 的 NameNode 和 DataNode 的路径,并将备份副本数量设置为 1
<configuration> <property> <name>dfs.name.dir</name> <value>/usr/local/java/hadoop-2.7.7/data/name</value> </property> <property> <name>dfs.data.dir</name> <value>/usr/local/java/hadoop-2.7.7/data/data</value> </property> <property> <name>dfs.replication</name> <value>1</value> </property> </configuration>
4 配置 Yarn
单机上不一定要启动 Yarn 的调度,一般在集群上使用 Yarn 调度才有价值,单价启动 Yarn 可能还会让程序运行过慢
4.1 配置 mapred-site.xml
cp mapred-site.xml.template mapred-site.xml
<configuration> <property> <name>mapreduce.framework.name</name> <value>yarn</value> </property> </configuration>
4.2 配置 yarn-site.xml
<configuration> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property> </configuration>
5 格式化 HDFS 文件系统
hdfs namenode -format
6 启动关闭 Hadoop 相关服务
$HADOOP_HOME/sbin/startk-all.sh $HADOOP_HOME/sbin/stop-all.sh
7 网页浏览工具
- NameNode 属性查看 http://localhost:50070
- ResourceManager 属性查看 http://localhost:8088
8 使用自带的示例程序测试
cd $HADOOP_HOME hadoop fs -cp README.txt hdfs://localhost:9000/user/hujinghui hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount README.txt workcount002
➜ hadoop-2.7.7 hadoop fs -ls /user/hujinghui/workcount002/ 20/09/19 14:54:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Found 2 items -rw-r--r-- 1 hujinghui supergroup 0 2020-09-19 14:52 /user/hujinghui/workcount002/_SUCCESS -rw-r--r-- 1 hujinghui supergroup 1306 2020-09-19 14:52 /user/hujinghui/workcount002/part-r-00000 ➜ hadoop-2.7.7 hadoop fs -cat /user/hujinghui/workcount002/part-r-00000 20/09/19 14:54:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable (BIS), 1 (ECCN) 1 (TSU) 1 (see 1