Spark stuck at removing broadcast variable (probably)


Problem description


Spark 2.0.0-preview

We've got an app that uses a fairly big broadcast variable. We run this on a big EC2 instance, so deployment is in client-mode. Broadcasted variable is a massive Map[String, Array[String]].
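For reference, the pattern described here looks roughly like the following sketch (the paths and the lookup map are illustrative placeholders, not from the actual app):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative sketch of the setup described above.
val sc = new SparkContext(new SparkConf().setAppName("broadcast-example"))

// The large lookup table, shipped to every executor once
val lookup: Map[String, Array[String]] = Map("key" -> Array("a", "b"))
val bcLookup = sc.broadcast(lookup)

val output = sc.textFile("hdfs:///input/path")
  .map(line => bcLookup.value.getOrElse(line, Array.empty[String]).mkString(","))

output.saveAsTextFile("hdfs:///output/path")
```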

At the end of saveAsTextFile, the output in the folder seems to be complete and correct (apart from .crc files still being there) BUT the spark-submit process is stuck on, seemingly, removing the broadcast variable. The stuck logs look like this: http://pastebin.com/wpTqvArY

My last run lasted for 12 hours after doing saveAsTextFile - just sitting there. I did a jstack on the driver process; most threads are parked: http://pastebin.com/E29JKVT7

Full story:

We used this code with Spark 1.5.0 and it worked, but then the data changed and something stopped fitting into Kryo's serialisation buffer. Increasing it didn't help, so I had to disable the KryoSerialiser. Tested it again - it hung. Switched to 2.0.0-preview - seems like the same issue.
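For reference, the buffer increase and the Kryo fallback mentioned above are typically done with configuration settings like these (values are illustrative; `spark.kryoserializer.buffer.max` must stay below 2g):

```
# Try a bigger Kryo buffer first
spark-submit \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.kryoserializer.buffer.max=1g \
  ...

# Fallback: drop Kryo and use the default JavaSerializer
spark-submit \
  --conf spark.serializer=org.apache.spark.serializer.JavaSerializer \
  ...
```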

I'm not quite sure what's even going on given that there's almost no CPU activity and no output in the logs, yet the output is not finalised the way it used to be.

Would appreciate any help, thanks.

Solution

I had a very similar issue.

I was updating from spark 1.6.1 to 2.0.1 and my steps were hanging after completion.

In the end, I managed to solve it by adding a sparkContext.stop() at the end of the task.
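A minimal sketch of where that call goes (assuming a standard Scala driver; names are illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MyJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("my-job"))
    try {
      // ... build RDDs, broadcast the lookup map, saveAsTextFile ...
    } finally {
      sc.stop() // shuts the context down so the driver JVM can exit cleanly
    }
  }
}
```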

Not sure why this is needed, but it solved my issue. Hope this helps.

PS: this post reminds me of https://xkcd.com/979/
