Spark HDFS Exception in createBlockOutputStream while uploading resource file


Problem description


I'm trying to run my JAR in the cluster with yarn-cluster, but I'm getting an exception after a while. The last INFO before it fails is Uploading resource. I've checked all the security groups and ran hadoop fs -ls successfully, but I'm still getting the error.


./bin/spark-submit --class MyMainClass --master yarn-cluster /tmp/myjar-1.0.jar myjarparameter

16/01/21 16:13:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/01/21 16:13:52 INFO client.RMProxy: Connecting to ResourceManager at yarn.myserver.com/publicip:publicport
16/01/21 16:13:53 INFO yarn.Client: Requesting a new application from cluster with 10 NodeManagers
16/01/21 16:13:53 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (13312 MB per container)
16/01/21 16:13:53 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/01/21 16:13:53 INFO yarn.Client: Setting up container launch context for our AM
16/01/21 16:13:53 INFO yarn.Client: Preparing resources for our AM container
16/01/21 16:13:54 INFO yarn.Client: Uploading resource file:/opt/spark-1.2.0-bin-hadoop2.3/lib/spark-assembly-1.2.0-hadoop2.3.0.jar -> hdfs://hdfs.myserver.com/user/henrique/.sparkStaging/application_1452514285349_6427/spark-assembly-1.2.0-hadoop2.3.0.jar
16/01/21 16:14:55 INFO hdfs.DFSClient: Exception in createBlockOutputStream
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/PRIVATE_IP:50010]
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:532)
    at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1341)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1167)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1122)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:522)
16/01/21 16:14:55 INFO hdfs.DFSClient: Abandoning BP-26920217-10.140.213.58-1440247331237:blk_1132201932_58466886
16/01/21 16:14:55 INFO hdfs.DFSClient: Excluding datanode 10.164.16.207:50010
16/01/21 16:15:55 INFO hdfs.DFSClient: Exception in createBlockOutputStream


./bin/hadoop fs -ls /user/henrique/.sparkStaging/

drwx------ henrique supergroup 0 2016-01-20 18:36 /user/henrique/.sparkStaging/application_1452514285349_5868
drwx------ henrique supergroup 0 2016-01-21 16:13 /user/henrique/.sparkStaging/application_1452514285349_6427
drwx------ henrique supergroup 0 2016-01-21 17:06 /user/henrique/.sparkStaging/application_1452514285349_6443

Solution


SOLVED! Hadoop was trying to connect to the DataNodes' private IPs. The problem was solved by adding this configuration to hdfs-site.xml:

<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>   
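As a sketch of an alternative that avoids editing hdfs-site.xml on the client, the same setting can be passed per job through Spark's spark.hadoop.* property passthrough, which forwards the suffix into the job's Hadoop Configuration. The class name, JAR path, and parameter below are the ones from the question; whether this suffices for the staging upload depends on your Spark version, so treat it as an assumption to verify:

```shell
# Forward the HDFS client option into the job's Hadoop Configuration
# via Spark's spark.hadoop.* prefix, so DataNodes are contacted by
# hostname instead of their (private) IP addresses.
./bin/spark-submit \
  --class MyMainClass \
  --master yarn-cluster \
  --conf spark.hadoop.dfs.client.use.datanode.hostname=true \
  /tmp/myjar-1.0.jar myjarparameter
```

This keeps the cluster-wide configuration untouched, which is useful when only clients outside the cluster's private network (e.g. connecting over public IPs) need hostname-based DataNode resolution.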
