Hadoop Datanode slave is not connecting to my master

Question

With so many errors, I can't figure out why the DataNode slave VM is not connecting to my master VM. Any suggestion is welcome. To start, here is one error from my slave VM's log:

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: ubuntu-378e53c1-3e1f-4f6e-904d-00ef078fe3f8:9000

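As a quick sanity check (assuming nc is installed on the slave), one can verify from the slave VM that the master's hostname resolves and that the NameNode RPC port from core-site.xml is reachable:

# run on the slave VM; hostname and port come from core-site.xml below
ping -c 1 ubuntu-378e53c1-3e1f-4f6e-904d-00ef078fe3f8
nc -zv ubuntu-378e53c1-3e1f-4f6e-904d-00ef078fe3f8 9000
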
Because of this, I can't run the job I want on my master VM:

hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5

which gives me this error:

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/ubuntu/QuasiMonteCarlo_1386793331690_1605707775/in/part0 could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.

And even hdfs dfsadmin -report (on the master VM) gives me all zeros:

Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Datanodes available: 0 (0 total, 0 dead)

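All zeros here mean that no DataNode has registered with the NameNode, so HDFS has no capacity at all. The DataNode's own log on each slave usually says why; with the default Hadoop 2.2.0 layout used in this setup (log file name assumed from the ubuntu user), it can be inspected with:

tail -n 50 /home/ubuntu/hadoop-2.2.0/logs/hadoop-ubuntu-datanode-*.log
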
For this setup, I built three Ubuntu VMs on OpenStack, one as the master and the others as slaves. On the master, /etc/hosts contains:

127.0.0.1 localhost
50.50.1.9 ubuntu-378e53c1-3e1f-4f6e-904d-00ef078fe3f8
50.50.1.8 slave1
50.50.1.4 slave2

core-site.xml

<property>
  <name>fs.default.name</name>
  <value>hdfs://ubuntu-378e53c1-3e1f-4f6e-904d-00ef078fe3f8:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/ubuntu/hadoop-2.2.0/tmp</value>
</property>

hdfs-site.xml

<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/home/ubuntu/hadoop-2.2.0/etc/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/home/ubuntu/hadoop-2.2.0/etc/hdfs/datanode</value>
</property>
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>

mapred-site.xml

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

And my slaves file contains one hostname per line: slave1 and slave2.

None of the logs on the master VM contain errors, but the slave VM gives the connection error above, and the NodeManager also reports an error in its log:

Error starting NodeManager org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ubuntu-e6df65dc-bf95-45ca-bad5-f8ddcc272b76/50.50.1.8 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused;

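Before changing any configuration, it is worth confirming which YARN ports are actually listening on the master (standard netstat flags; sudo shows the owning process):

sudo netstat -tlnp | grep -E ':80(3[0-3]|88)'
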
From my slave machine: core-site.xml

<property>
  <name>fs.default.name</name>
  <value>hdfs://ubuntu-378e53c1-3e1f-4f6e-904d-00ef078fe3f8:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/ubuntu/hadoop-2.2.0/tmp</value>
</property>

hdfs-site.xml

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/home/ubuntu/hadoop-2.2.0/etc/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/home/ubuntu/hadoop-2.2.0/etc/hdfs/datanode</value>
</property>

and in my /etc/hosts:

127.0.0.1 localhost
50.50.1.8 ubuntu-e6df65dc-bf95-45ca-bad5-f8ddcc272b76
50.50.1.9 ubuntu-378e53c1-3e1f-4f6e-904d-00ef078fe3f8

The jps output on the master:

15863 ResourceManager
15205 SecondaryNameNode
14967 NameNode
16194 Jps

and on the slave:

1988 Jps
1365 DataNode
1894 NodeManager

Solution

Of all the errors shown, the one below is the main reason the slave could not connect to the master:

Error starting NodeManager org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From ubuntu-e6df65dc-bf95-45ca-bad5-f8ddcc272b76/50.50.1.8 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused;

Basically, 8031 is the default port of yarn.resourcemanager.resource-tracker.address. I checked with lsof -i :8031 and found that the port wasn't enabled/open/allowed. Since I'm using OpenStack (cloud), I added 8031 and the other ports that were showing errors to the security group, and voilà, everything worked as intended.

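For reference, a minimal sketch of the check and the firewall rule (the security group name default and the subnet CIDR are assumptions; substitute your own tenant's values):

# on the master: see whether anything answers on the tracker port
lsof -i :8031

# with the modern OpenStack CLI: allow the cluster subnet to reach 8031
openstack security group rule create --protocol tcp \
    --dst-port 8031 --remote-ip 50.50.1.0/24 default

The same kind of rule is typically needed for the other Hadoop/YARN defaults: 9000 (NameNode RPC, as configured here), 8030/8032/8033/8088 (ResourceManager), 50010/50020/50075 (DataNode), and 50070 (NameNode web UI).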