1 Symptom
On a freshly built Hadoop 3.1.1 environment, after starting Hadoop, the jps command showed that there was no DataNode process on the slaves:
[cndba@hadoopmaster ~]$ jps
23234 ResourceManager
22998 SecondaryNameNode
23575 Jps
22683 NameNode
[cndba@hadoopslave1 ~]$ jps
9682 Jps
9535 NodeManager
[cndba@hadoopslave2 ~]$ jps
9356 Jps
9199 NodeManager
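This check is easy to script. The following is a minimal sketch, not part of the original setup: it parses jps output and reports which expected daemons are absent. The function name missing_daemons and the list of expected daemons are illustrative assumptions; the sample text is the hadoopslave1 output shown above.

```python
def missing_daemons(jps_output, expected):
    """Return the expected daemon names that do not appear in jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # A jps line looks like "<pid> <name>"; ignore anything else.
        if len(parts) == 2 and parts[0].isdigit():
            running.add(parts[1])
    return sorted(set(expected) - running)


# Sample: the hadoopslave1 output shown above.
SLAVE_JPS = """9682 Jps
9535 NodeManager
"""

# Assumption: a worker node should run both of these daemons.
print(missing_daemons(SLAVE_JPS, ["DataNode", "NodeManager"]))  # prints ['DataNode']
```

On a real node the text would come from running jps itself, for example through subprocess.run(["jps"], capture_output=True, text=True).stdout.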
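2 Problems caused by a clusterID mismatch
A quick search shows that the usual explanation is a format run without first stopping all processes, or format run more than once. Each format gives the NameNode a new clusterID while the DataNodes keep the old one, so the DataNode clusterID no longer matches the NameNode clusterID and no DataNode process comes up after startup.
There are two fixes:
Option 1: keep the existing data
- Take the clusterID from ~/dfs/name/current/VERSION on the NameNode and use it to replace the clusterID in ~/dfs/data/current/VERSION on every DataNode machine.
- Restart Hadoop: start-all.sh
This approach leaves the existing data untouched and avoids reformatting.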
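The edit in Option 1 can be sketched in a few lines. This is a minimal illustration, assuming the VERSION file is a plain key=value text file with a single clusterID= line; the function names are hypothetical and not part of Hadoop.

```python
import re
from pathlib import Path

# Assumption: VERSION contains exactly one line of the form "clusterID=CID-...".
CLUSTER_ID_RE = re.compile(r"^clusterID=(.*)$", re.MULTILINE)


def extract_cluster_id(version_text):
    """Return the clusterID value found in the text of a VERSION file."""
    match = CLUSTER_ID_RE.search(version_text)
    if match is None:
        raise ValueError("no clusterID line found")
    return match.group(1).strip()


def replace_cluster_id(version_text, new_id):
    """Return the VERSION text with its clusterID line set to new_id."""
    extract_cluster_id(version_text)  # raises if the line is missing
    return CLUSTER_ID_RE.sub(lambda m: "clusterID=" + new_id, version_text, count=1)


def sync_version_file(namenode_version, datanode_version):
    """Copy the NameNode clusterID into a DataNode VERSION file (hypothetical helper)."""
    nn_id = extract_cluster_id(Path(namenode_version).read_text())
    dn_path = Path(datanode_version)
    dn_path.write_text(replace_cluster_id(dn_path.read_text(), nn_id))
    return nn_id
```

The NameNode and DataNode files normally live on different machines, so in practice the NameNode's VERSION file (or just its clusterID) has to be copied to each DataNode first. It is probably safest to make the change with the cluster stopped and a backup of the original VERSION file at hand.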
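Option 2: reformat (this erases the data stored in HDFS)
- Run ./stop-all.sh to shut down the cluster
- Delete the folder that stores the HDFS data blocks (hadoop/tmp/), then recreate it
- Delete the log files under hadoop's logs directory
- Run hadoop namenode -format to format Hadoop (on Hadoop 3 this form is deprecated in favor of hdfs namenode -format)
- Restart the Hadoop cluster
3 Other causes
My case was different and had nothing to do with a clusterID mismatch.
Looking at the startup output again, it turned out that the user name had been mistyped: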
[cndba@hadoopmaster hadoop]$ start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as cndba in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [hadoopmaster]
Starting datanodes
ERROR: datanode can only be executed by cbdba.
Starting secondary namenodes [hadoopmaster]
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
Starting resourcemanager
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
Starting nodemanagers
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
[cndba@hadoopmaster hadoop]$
In hadoop-env.sh the user had been typed as cbdba instead of cndba:
export HDFS_DATANODE_USER="cbdba"
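A typo like this can be caught before starting the cluster. The sketch below is a minimal example under stated assumptions: it scans the text of hadoop-env.sh for uncommented export lines whose variable name ends in _USER and lists those whose value differs from the expected account. Only HDFS_DATANODE_USER appears in this post; matching every *_USER variable, and the function name mismatched_users, are assumptions made for illustration.

```python
import re

# Matches uncommented lines such as: export HDFS_DATANODE_USER="cbdba"
USER_VAR_RE = re.compile(
    r"""^[ \t]*export[ \t]+([A-Z_]+_USER)=["']?([^"'\s#]+)["']?""",
    re.MULTILINE,
)


def mismatched_users(env_text, expected_user):
    """Return (variable, value) pairs whose value differs from expected_user."""
    return [
        (name, value)
        for name, value in USER_VAR_RE.findall(env_text)
        if value != expected_user
    ]


# Sample: the line from this post next to a correctly spelled one.
SAMPLE_ENV = '''export HDFS_NAMENODE_USER="cndba"
export HDFS_DATANODE_USER="cbdba"
'''

print(mismatched_users(SAMPLE_ENV, "cndba"))  # prints [('HDFS_DATANODE_USER', 'cbdba')]
```

For a real check, read the text from the hadoop-env.sh in the Hadoop configuration directory and pass getpass.getuser() as the expected user.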
After correcting it to cndba, I started the cluster again:
[cndba@hadoopmaster hadoop]$ start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as cndba in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [hadoopmaster]
Starting datanodes
hadoopslave2: ERROR: Cannot set priority of datanode process 12752
hadoopslave1: ERROR: Cannot set priority of datanode process 13164
Starting secondary namenodes [hadoopmaster]
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
Starting resourcemanager
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
Starting nodemanagers
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
[cndba@hadoopmaster hadoop]$
This time a different error appeared:
hadoopslave2: ERROR: Cannot set priority of datanode process 12752
Check the DataNode log on the slave node:
************************************************************/
2019-01-23 05:23:23,501 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
2019-01-23 05:23:23,619 ERROR org.apache.hadoop.conf.Configuration: error parsing conf hdfs-site.xml
com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' (code 12288 / 0x3000) in epilog; expected '<'
at [row,col,system-id]: [50,17,"file:/home/cndba/hadoop/etc/hadoop/hdfs-site.xml"]
at com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:653)
at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2133)
at com.ctc.wstx.sr.BasicStreamReader.closeContentTree(BasicStreamReader.java:2991)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2734)
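The "Cannot set priority" message looks like a permissions problem, but the DataNode log shows the real cause: hdfs-site.xml fails to parse because of a full-width space (code 12288 / 0x3000) at row 50, column 17, after the closing root tag. After fixing the configuration file and restarting, everything is normal: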
[cndba@hadoopmaster hadoop]$ start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as cndba in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [hadoopmaster]
Starting datanodes
Starting secondary namenodes [hadoopmaster]
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
Starting resourcemanager
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
Starting nodemanagers
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
[cndba@hadoopmaster hadoop]$
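A full-width space such as the one reported in the DataNode log is nearly invisible in most editors. One way to locate it is to list every non-ASCII character in the file with its position. The sketch below is a minimal illustration; find_non_ascii is a hypothetical helper, and its column numbers are plain character offsets that may not match the parser's column count exactly.

```python
def find_non_ascii(text):
    """Return (row, col, codepoint) for each non-ASCII character, 1-based."""
    hits = []
    for row, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((row, col, "U+%04X" % ord(ch)))
    return hits


# Sample: a full-width space (U+3000) left after the closing root tag.
SAMPLE_XML = "<configuration>\n</configuration>\n\u3000\n"

print(find_non_ascii(SAMPLE_XML))  # prints [(3, 1, 'U+3000')]
```

To scan the real file, read it as UTF-8 first, for example Path("/home/cndba/hadoop/etc/hadoop/hdfs-site.xml").read_text(encoding="utf-8"). Legitimate non-ASCII text, such as comments written in Chinese, will be listed too, so the output still needs a quick manual review.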
Checking the processes confirms that everything is running, including DataNode on both slaves:
[cndba@hadoopmaster hadoop]$ jps
13030 SecondaryNameNode
12791 NameNode
13271 ResourceManager
13752 Jps
[cndba@hadoopmaster hadoop]$
[cndba@hadoopslave2 logs]$ jps
13587 Jps
13302 DataNode
13422 NodeManager
[cndba@hadoopslave2 logs]$
[root@hadoopslave1 ~]# jps
13876 NodeManager
14026 Jps
13756 DataNode
[root@hadoopslave1 ~]#