org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0

This article describes how to handle org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0. The question and answer below should be a useful reference for anyone facing the same problem.

Problem description


I'm running a Spark job in speculation mode. I have around 500 tasks and around 500 files of 1 GB each, gz compressed. In each job, for 1-2 tasks, I keep getting the attached error, after which the task reruns dozens of times (preventing the job from completing). Any idea what this error means and how to overcome it?

Many thanks in advance!
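
(For context: speculative execution is controlled by the spark.speculation setting. Below is a minimal sketch of the kind of setup described above; the app name, input path, and code are illustrative assumptions, not taken from the original question.)

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical setup matching the description: speculation enabled,
// roughly 500 gzip-compressed 1 GB input files.
val conf = new SparkConf()
  .setAppName("gz-ingest")            // illustrative name
  .set("spark.speculation", "true")   // re-launch straggler tasks speculatively

val sc = new SparkContext(conf)

// gzip files are not splittable, so each file becomes exactly one
// partition and therefore one task, which is consistent with
// ~500 tasks for ~500 files.
val lines = sc.textFile("hdfs:///data/input/*.gz")

The error that keeps recurring: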

org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0
at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:384)
at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:381)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.MapOutputTracker$.org$apache$spark$MapOutputTracker$$convertMapStatuses(MapOutputTracker.scala:380)
at org.apache.spark.MapOutputTracker.getServerStatuses(MapOutputTracker.scala:176)
at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$.fetch(BlockStoreShuffleFetcher.scala:42)
at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:40)
at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:92)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
at org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)

Solution

This happened to me when I gave the worker node more memory than the machine actually has. Since it had no swap, Spark crashed while trying to store objects for shuffling once no memory was left.

The solution was to either add swap, or configure the worker/executor to use less memory, in addition to using the MEMORY_AND_DISK storage level for several persists.
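
To make that concrete, here is a minimal sketch of both in-application levers; the memory figure, input path, and RDD pipeline are illustrative assumptions rather than part of the original answer:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._   // pair-RDD implicits on older Spark versions
import org.apache.spark.storage.StorageLevel

// Cap executor memory below the node's physical RAM so the shuffle
// cannot exhaust it; 4g is an illustrative figure, not a recommendation.
val conf = new SparkConf()
  .setAppName("shuffle-safe-job")
  .set("spark.executor.memory", "4g")

val sc = new SparkContext(conf)

val pairs = sc.textFile("hdfs:///data/input/*.gz")
  .flatMap(_.split("\\s+"))
  .map(word => (word, 1))

// MEMORY_AND_DISK spills partitions that no longer fit in memory to
// local disk instead of failing, unlike the default MEMORY_ONLY level.
pairs.persist(StorageLevel.MEMORY_AND_DISK)

pairs.reduceByKey(_ + _).count()

Adding swap, the other option mentioned above, is done at the operating-system level rather than in Spark, so it is not shown here.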

This concludes the article on org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0. Hopefully the answer above is a helpful reference.
