Spark job running on a YARN cluster fails with java.io.FileNotFoundException: File does not exist, even though the file exists on the master node

Problem Description

I am fairly new to Spark. I tried searching but couldn't find a proper solution. I have installed Hadoop 2.7.2 on two boxes (one master node and one worker node) and set up the cluster by following this link: http://javadev.org/docs/hadoop/centos/6/installation/multi-node-installation-on-centos-6-non-sucure-mode/ I was running the Hadoop and Spark applications as the root user to test the cluster.

I installed Spark on the master node, and Spark starts without any errors. However, when I submit a job using spark-submit, I get a File Not Found exception, even though the file is present on the master node at the very same location named in the error. I am executing the spark-submit command below; please find the log output after the command.

/bin/spark-submit --class com.test.Engine --master yarn --deploy-mode cluster /app/spark-test.jar

16/04/21 19:16:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/04/21 19:16:13 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/04/21 19:16:14 INFO Client: Requesting a new application from cluster with 1 NodeManagers
16/04/21 19:16:14 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/04/21 19:16:14 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
16/04/21 19:16:14 INFO Client: Setting up container launch context for our AM
16/04/21 19:16:14 INFO Client: Setting up the launch environment for our AM container
16/04/21 19:16:14 INFO Client: Preparing resources for our AM container
16/04/21 19:16:14 INFO Client: Source and destination file systems are the same. Not copying file:/mi/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar
16/04/21 19:16:14 INFO Client: Source and destination file systems are the same. Not copying file:/app/spark-test.jar
16/04/21 19:16:14 INFO Client: Source and destination file systems are the same. Not copying file:/tmp/spark-120aeddc-0f87-4411-9400-22ba01096249/__spark_conf__5619348744221830008.zip
16/04/21 19:16:14 INFO SecurityManager: Changing view acls to: root
16/04/21 19:16:14 INFO SecurityManager: Changing modify acls to: root
16/04/21 19:16:14 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/04/21 19:16:15 INFO Client: Submitting application 1 to ResourceManager
16/04/21 19:16:15 INFO YarnClientImpl: Submitted application application_1461246306015_0001
16/04/21 19:16:16 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:16 INFO Client: 
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1461246375622
     final status: UNDEFINED
     tracking URL: http://sparkcluster01.testing.com:8088/proxy/application_1461246306015_0001/
     user: root
16/04/21 19:16:17 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:18 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:19 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:20 INFO Client: Application report for application_1461246306015_0001 (state: ACCEPTED)
16/04/21 19:16:21 INFO Client: Application report for application_1461246306015_0001 (state: FAILED)
16/04/21 19:16:21 INFO Client: 
     client token: N/A
     diagnostics: Application application_1461246306015_0001 failed 2 times due to AM Container for appattempt_1461246306015_0001_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:http://sparkcluster01.testing.com:8088/cluster/app/application_1461246306015_0001Then, click on links to logs of each attempt.
Diagnostics: java.io.FileNotFoundException: File file:/app/spark-test.jar does not exist
Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1461246375622
     final status: FAILED
     tracking URL: http://sparkcluster01.testing.com:8088/cluster/app/application_1461246306015_0001
     user: root
Exception in thread "main" org.ap/app/spark-test.jarache.spark.SparkException: Application application_1461246306015_0001 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1034)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I even tried running Spark against HDFS by placing my application on HDFS and giving the HDFS path in the spark-submit command. Even then it throws a File Not Found exception, this time for a Spark conf file. I am executing the spark-submit command below; please find the log output after the command.

./bin/spark-submit --class com.test.Engine --master yarn --deploy-mode cluster hdfs://sparkcluster01.testing.com:9000/beacon/job/spark-test.jar

16/04/21 18:11:45 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/04/21 18:11:46 INFO Client: Requesting a new application from cluster with 1 NodeManagers
16/04/21 18:11:46 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/04/21 18:11:46 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
16/04/21 18:11:46 INFO Client: Setting up container launch context for our AM
16/04/21 18:11:46 INFO Client: Setting up the launch environment for our AM container
16/04/21 18:11:46 INFO Client: Preparing resources for our AM container
16/04/21 18:11:46 INFO Client: Source and destination file systems are the same. Not copying file:/mi/spark/lib/spark-assembly-1.6.1-hadoop2.6.0.jar
16/04/21 18:11:47 INFO Client: Uploading resource hdfs://sparkcluster01.testing.com:9000/beacon/job/spark-test.jar -> file:/root/.sparkStaging/application_1461234217994_0017/spark-test.jar
16/04/21 18:11:49 INFO Client: Source and destination file systems are the same. Not copying file:/tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21/__spark_conf__6818051470272245610.zip
16/04/21 18:11:50 INFO SecurityManager: Changing view acls to: root
16/04/21 18:11:50 INFO SecurityManager: Changing modify acls to: root
16/04/21 18:11:50 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/04/21 18:11:50 INFO Client: Submitting application 17 to ResourceManager
16/04/21 18:11:50 INFO YarnClientImpl: Submitted application application_1461234217994_0017
16/04/21 18:11:51 INFO Client: Application report for application_1461234217994_0017 (state: ACCEPTED)
16/04/21 18:11:51 INFO Client: 
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1461242510849
     final status: UNDEFINED
     tracking URL: http://sparkcluster01.testing.com:8088/proxy/application_1461234217994_0017/
     user: root
16/04/21 18:11:52 INFO Client: Application report for application_1461234217994_0017 (state: ACCEPTED)
16/04/21 18:11:53 INFO Client: Application report for application_1461234217994_0017 (state: ACCEPTED)
16/04/21 18:11:54 INFO Client: Application report for application_1461234217994_0017 (state: FAILED)
16/04/21 18:11:54 INFO Client: 
     client token: N/A
     diagnostics: Application application_1461234217994_0017 failed 2 times due to AM Container for appattempt_1461234217994_0017_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:http://sparkcluster01.testing.com:8088/cluster/app/application_1461234217994_0017Then, click on links to logs of each attempt.
Diagnostics: File file:/tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21/__spark_conf__6818051470272245610.zip does not exist
java.io.FileNotFoundException: File file:/tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21/__spark_conf__6818051470272245610.zip does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:609)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:822)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:599)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1461242510849
     final status: FAILED
     tracking URL: http://sparkcluster01.testing.com:8088/cluster/app/application_1461234217994_0017
     user: root
Exception in thread "main" org.apache.spark.SparkException: Application application_1461234217994_0017 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1034)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/04/21 18:11:55 INFO ShutdownHookManager: Shutdown hook called
16/04/21 18:11:55 INFO ShutdownHookManager: Deleting directory /tmp/spark-f4eef3ac-2add-42f8-a204-be7959c26f21

Solution

The Spark configuration was not pointing to the right Hadoop configuration directory. The Hadoop 2.7.2 configuration resides under hadoop2.7.2/etc/hadoop/ rather than /root/hadoop2.7.2/conf. Once I set HADOOP_CONF_DIR=/root/hadoop2.7.2/etc/hadoop/ in spark-env.sh, spark-submit started working and the File Not Found exception disappeared. Earlier it was pointing to /root/hadoop2.7.2/conf (which does not exist). This also explains the log lines above: without a readable core-site.xml, fs.defaultFS falls back to the local file:/// scheme, so Spark stages the jar and the conf zip on the local filesystem of the submitting machine ("Source and destination file systems are the same. Not copying file:/app/spark-test.jar"), and the NodeManagers then fail to find those files on their own disks. If Spark does not point to the proper Hadoop configuration directory, it can produce errors like these. I think this is probably a bug in Spark; it should handle the misconfiguration gracefully rather than throwing an ambiguous error message.
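
For reference, the fix amounts to one line in spark-env.sh. Below is a minimal sketch based on the layout described above; the exact Hadoop path is specific to this installation, and the sanity-check commands are illustrative additions, so adjust both to your own setup:

# --- $SPARK_HOME/conf/spark-env.sh ---
# (copy spark-env.sh.template to spark-env.sh if it does not exist yet)
# Point Spark at the directory that actually contains core-site.xml and yarn-site.xml:
export HADOOP_CONF_DIR=/root/hadoop2.7.2/etc/hadoop
# The old, broken value -- Hadoop 2.x ships no conf/ directory:
# export HADOOP_CONF_DIR=/root/hadoop2.7.2/conf

# --- sanity checks, run from a shell before resubmitting ---
ls "$HADOOP_CONF_DIR/core-site.xml"                       # must exist
grep -A1 fs.defaultFS "$HADOOP_CONF_DIR/core-site.xml"    # should print hdfs://..., not file:///

After the change, the "Source and destination file systems are the same" lines in the client log should be replaced by something like "Uploading resource file:/app/spark-test.jar -> hdfs://...", confirming that the jar is now staged through HDFS where the NodeManagers can fetch it.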
