Spark on Dataproc fails with java.io.FileNotFoundException


Problem description


Spark job launched in Dataproc cluster fails with below exception. I have tried with various cluster configs but the result is same. I am getting this error in Dataproc image 1.2.


Note: There are no preemptible workers, and there is sufficient space on the disks. However, I have noticed that the /hadoop/yarn/nm-local-dir/usercache/root folder does not exist at all on the worker nodes, though I can see a folder named dr.who.
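To confirm what the NodeManager is actually using for its local directories, and whether the usercache tree is writable, something like the following can be run on a worker node (a rough diagnostic sketch; the config file path assumes the standard Dataproc Hadoop layout under /etc/hadoop/conf):

```shell
# Show which local dirs YARN is configured to use
grep -A 1 'yarn.nodemanager.local-dirs' /etc/hadoop/conf/yarn-site.xml

# Check ownership and permissions of the usercache tree; Spark executors
# must be able to create per-application subdirectories here
ls -ld /hadoop/yarn/nm-local-dir /hadoop/yarn/nm-local-dir/usercache

# Verify there is free space on the volume backing the local dirs
df -h /hadoop
```

If the usercache directory is missing or owned by the wrong user, executors fail with exactly this kind of "Failed to create local dir" error when writing shuffle blocks.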

java.io.IOException: Failed to create local dir in /hadoop/yarn/nm-local-dir/usercache/root/appcache/application_1534256335401_0001/blockmgr-89931abb-470c-4eb2-95a3-8f8bfe5334d7/2f.
    at org.apache.spark.storage.DiskBlockManager.getFile(DiskBlockManager.scala:70)
    at org.apache.spark.storage.DiskBlockManager.getFile(DiskBlockManager.scala:80)
    at org.apache.spark.shuffle.IndexShuffleBlockResolver.getDataFile(IndexShuffleBlockResolver.scala:54)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:68)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
    at org.apache.spark.scheduler.Task.run(Task.scala:86)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Possible duplicate of: Spark on Google Dataproc fails due to java.io.FileNotFoundException: /hadoop/yarn/nm-local-dir/usercache/root/appcache/

Recommended answer


I could resolve the issue by using Dataproc 1.3. However, 1.3 does not come with the BigQuery connector, which then needs to be handled separately: https://cloud.google.com/dataproc/docs/concepts/connectors/bigquery
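Since the 1.3 image no longer bundles the BigQuery connector, the connector jar has to be supplied to the job explicitly. One way to do this (a sketch only; the cluster name, region, job class, and arguments are placeholders, and the gs://hadoop-lib/bigquery jar location is the one Google documented for the Hadoop 2 connector at the time):

```shell
# Create a cluster on the 1.3 image (placeholder name and region)
gcloud dataproc clusters create my-cluster \
    --image-version=1.3 \
    --region=us-central1

# Submit the Spark job with the BigQuery connector jar added to the
# driver and executor classpaths via --jars
gcloud dataproc jobs submit spark \
    --cluster=my-cluster \
    --region=us-central1 \
    --jars=gs://hadoop-lib/bigquery/bigquery-connector-hadoop2-latest.jar \
    --class=com.example.MyJob \
    -- arg1 arg2
```

Alternatively, the connector can be baked into the cluster at creation time with an initialization action, as described on the connector documentation page linked above.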

