Spark Indefinite Waiting with "Asked to send map output locations for shuffle"

Problem description

My jobs often hang with this kind of message:

14/09/01 00:32:18 INFO spark.MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 0 to spark@*:37619

Would be great if someone could explain what Spark is doing when it spits out this message. What does this message mean? What could the user be doing wrong to cause this? What configurables should be tuned?

It's really hard to debug because it doesn't OOM and it doesn't give a stack trace; it just sits and sits and sits.

This has been an issue in Spark at least as far back as 1.0.0 and is still ongoing with Spark 1.5.0.

Recommended answer

Based on this thread, more recent versions of Spark have gotten better at shuffling (and at reporting errors if it fails anyway). The following tips were also mentioned:

This is very likely because the serialized map output locations buffer exceeds the Akka frame size. Please try setting "spark.akka.frameSize" (default 10 MB) to some higher number, like 64 or 128.
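As a minimal sketch of how that setting could be passed, the snippet below assembles a spark-submit command line that raises "spark.akka.frameSize". The helper name build_submit_command and the script name my_job.py are hypothetical, and it assumes a Spark 1.x release whose spark-submit accepts --conf (very early 1.0.x releases may need the property set through SparkConf or spark-defaults.conf instead).

```python
def build_submit_command(app, frame_size_mb=128):
    """Return a spark-submit argument list with a larger Akka frame size.

    frame_size_mb is expressed in MB, the unit spark.akka.frameSize
    uses in Spark 1.x. `app` is a placeholder for your own job script.
    """
    if frame_size_mb <= 0:
        raise ValueError("frame size must be a positive number of MB")
    return [
        "spark-submit",
        "--conf", "spark.akka.frameSize=%d" % frame_size_mb,
        app,
    ]


if __name__ == "__main__":
    # Hypothetical job script; print the command rather than running it.
    print(" ".join(build_submit_command("my_job.py", 64)))
```

The resulting list can be handed to subprocess.call, or the same property can be placed in conf/spark-defaults.conf if you prefer not to repeat it on every submission.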

In the newest version of Spark, this would throw a better error, for what it's worth.

A possible workaround:

If the distribution of the keys in your groupByKey is skewed (some keys appear way more often than others) you should consider modifying your job to use reduceByKey instead wherever possible.
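To illustrate why this helps, here is a small pure-Python simulation (not Spark code) of the two strategies. It assumes the aggregation is a simple sum; the function names and the sample data are made up for the example. It counts how many records would cross the shuffle when every record is sent as-is (groupByKey style) versus when each partition first combines values per key (reduceByKey style).

```python
from collections import defaultdict


def shuffle_group_by_key(partitions):
    """groupByKey style: every (key, value) record is shuffled unchanged."""
    shuffled = [record for part in partitions for record in part]
    grouped = defaultdict(list)
    for key, value in shuffled:
        grouped[key].append(value)
    sums = {key: sum(values) for key, values in grouped.items()}
    return sums, len(shuffled)


def shuffle_reduce_by_key(partitions, func):
    """reduceByKey style: combine per key inside each partition first."""
    shuffled = []
    for part in partitions:
        local = {}
        for key, value in part:
            local[key] = func(local[key], value) if key in local else value
        # Only one record per key per partition crosses the shuffle.
        shuffled.extend(local.items())
    merged = {}
    for key, value in shuffled:
        merged[key] = func(merged[key], value) if key in merged else value
    return merged, len(shuffled)


# Skewed sample data: "hot" appears far more often than "cold".
partitions = [
    [("hot", 1)] * 1000 + [("cold", 1)],
    [("hot", 1)] * 1000 + [("cold", 1)],
]

grouped_sums, grouped_records = shuffle_group_by_key(partitions)
reduced_sums, reduced_records = shuffle_reduce_by_key(partitions, lambda a, b: a + b)
print(grouped_records, reduced_records)
```

Both approaches produce the same totals, but the second ships far fewer records for the skewed key. In PySpark the corresponding change would be along the lines of replacing rdd.groupByKey().mapValues(sum) with rdd.reduceByKey(lambda a, b: a + b), which only works when the aggregation can be expressed as an associative, commutative function.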

And as a side note:

The issue was fixed for me by allocating just one core per executor.

Maybe your executor-memory setting should be divided by executor-cores when estimating the memory available to each task.
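One way to read that last tip: tasks running concurrently in an executor share its heap, so the memory available to each task is roughly the executor memory divided by the number of cores. The sketch below only does that arithmetic; the function name is hypothetical, and the estimate deliberately ignores Spark's storage and shuffle memory fractions and any JVM overhead.

```python
def memory_per_task_mb(executor_memory_mb, executor_cores):
    """Rough per-task heap estimate: executor memory split across cores.

    This is a back-of-the-envelope figure, not a value Spark reports.
    """
    if executor_cores < 1:
        raise ValueError("executor_cores must be at least 1")
    return executor_memory_mb // executor_cores


if __name__ == "__main__":
    # 8 GB executor: four concurrent tasks vs. a single task.
    print(memory_per_task_mb(8192, 4))
    print(memory_per_task_mb(8192, 1))
```

Under this estimate, an 8 GB executor with four cores leaves each task about a quarter of what a single-core executor would, which may be why allocating one core per executor helped in the case above.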
