Databricks Spark: java.lang.OutOfMemoryError: GC overhead limit exceeded


Problem description

I am executing a Spark job in a Databricks cluster. I trigger the job via an Azure Data Factory pipeline, and it runs at a 15-minute interval; after three or four successful executions it fails with the exception "java.lang.OutOfMemoryError: GC overhead limit exceeded". Although there are many answers to this question, in most of those cases the jobs never run at all, whereas in my case the job fails only after several previous runs have succeeded. My data size is less than 20 MB.

My cluster configuration is:

So my question is: what changes should I make in the server configuration? If the issue comes from my code, why does it succeed most of the time? Please advise and suggest a solution.

Recommended answer

You may try increasing the memory of the driver node.
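
In Databricks the driver memory is usually controlled through the cluster settings rather than from job code: either pick a larger driver node type when editing the cluster, or add an entry to the cluster's Spark config. A minimal sketch in spark-defaults.conf style follows; the 8g value is only an assumption and has to fit within the physical memory of the driver node type you choose:

    # Driver JVM heap size; 8g is an assumed value, size it to your driver node
    spark.driver.memory 8g

If you paste this into the Spark config box of the Databricks cluster, keep only the key/value pair (without the comment line) and restart the cluster so that the driver JVM is relaunched with the new setting.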

