Spring Boot YARN doesn't run on Hadoop 2.8.0: client cannot access DataNode


Question

I'm trying to run the Spring Boot YARN sample (https://spring.io/guides/gs/yarn-basic/) on Windows. In application.yml I changed fsUri and resourceManagerHost to point to my VM's host, 192.168.... But when I try to run the application, the following exception appears:

DFSClient: Exception in createBlockOutputStream
java.net.ConnectException: Connection timed out: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1508)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1284)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
[2017-05-27 19:59:49.570] boot - 7728  INFO [Thread-5] --- DFSClient: Abandoning BP-646365587-10.0.2.15-1495898351938:blk_1073741830_1006
[2017-05-27 19:59:49.602] boot - 7728  INFO [Thread-5] --- DFSClient: Excluding datanode DatanodeInfoWithStorage[10.0.2.15:50010,DS-f909ec7a-8374-4cdd-9cfc-0e778810d98c,DISK]
[2017-05-27 19:59:49.647] boot - 7728  WARN [Thread-5] --- DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /app/gs-yarn-basic/gs-yarn-basic-container-0.1.0.jar could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.

This means the DataNode isn't reachable from my host machine. For that reason I added the following to hdfs-site.xml:

<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
  <description>Whether clients should use datanode hostnames when
    connecting to datanodes.
  </description>
</property>

But it still throws the exception.

I have Hadoop 2.8.0 running on my VM. Here are the configuration files:

core-site.xml


<configuration>
   <property>
       <name>fs.defaultFS</name>
       <value>hdfs://0.0.0.0:9000</value>
   </property>

</configuration>

hdfs-site.xml


<configuration>
   <property>
       <name>dfs.replication</name>
       <value>1</value>
   </property>
   <property>
       <name>dfs.namenode.name.dir</name>
       <value>/usr/local/hadoop/hadoop-2.8.0/data/namenode</value>
   </property>
   <property>
       <name>dfs.datanode.data.dir</name>
       <value>/usr/local/hadoop/hadoop-2.8.0/data/datanode</value>
   </property>
   <property>
       <name>dfs.permissions.enabled</name>
       <value>false</value>
   </property>
   <property>
       <name>dfs.client.use.datanode.hostname</name>
       <value>true</value>
       <description>Whether clients should use datanode hostnames when
         connecting to datanodes.
       </description>
   </property>
</configuration>

mapred-site.xml


<configuration>    
   <property>
       <name>mapreduce.framework.name</name>
       <value>yarn</value>
   </property>
</configuration>

yarn-site.xml


<configuration>
    <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle</value>
    </property>
    <property>
       <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>8192</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>8192</value>
    </property>
    <property>
        <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
        <value>99</value>
    </property>
</configuration>


Recommended answer

Your core-site.xml should point to the NameNode's address, but it currently points to 0.0.0.0, which means "all addresses on the local machine". That is ambiguous: every machine that reads this configuration would treat itself as the NameNode.

A Hadoop cluster like this one should have exactly one NameNode.

Replacing 0.0.0.0 with the NameNode's IP address or hostname should resolve the issue you are facing.
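As a rough sketch of that change, core-site.xml on the VM might look something like the fragment below. The name namenode-host is a hypothetical placeholder, not a value from the question: substitute the VM's actual hostname or IP address, and make sure the Windows client can resolve and reach it. Port 9000 is kept from the original configuration.

<configuration>
   <property>
       <name>fs.defaultFS</name>
       <!-- namenode-host is a placeholder (assumption): use the real
            hostname or IP of the machine running the NameNode -->
       <value>hdfs://namenode-host:9000</value>
   </property>
</configuration>

After editing the file, HDFS would need to be restarted for the new address to take effect, and fsUri in application.yml should presumably point to the same address.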
