What is spark.driver.maxResultSize?


Question

The documentation says:

Limit of total size of serialized results of all partitions for each Spark action (e.g. collect). Should be at least 1M, or 0 for unlimited. Jobs will be aborted if the total size is above this limit. Having a high limit may cause out-of-memory errors in driver (depends on spark.driver.memory and memory overhead of objects in JVM). Setting a proper limit can protect the driver from out-of-memory errors.
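The property can be set on the command line (`--conf spark.driver.maxResultSize=2g` with spark-submit) or when building a session. A minimal PySpark sketch, assuming pyspark is installed; the application name and values here are illustrative, not a recommendation:

```python
# Sketch: configuring spark.driver.maxResultSize when building a session.
# Requires a working Spark installation; values below are illustrative.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("maxResultSize-demo")                # hypothetical app name
    .config("spark.driver.maxResultSize", "2g")   # raise the 1g default
    .config("spark.driver.memory", "4g")          # leave the driver headroom
    .getOrCreate()
)
```

Note that the two settings interact: results that fit under `spark.driver.maxResultSize` must still fit, once deserialized, inside `spark.driver.memory`.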

What does this attribute do, exactly? I mean, at first (since I am not battling a job that fails due to out-of-memory errors) I thought I should increase it.

On second thought, it seems that this attribute defines the maximum size of the result a worker can send to the driver, so leaving it at the default (1G) would be the best approach to protect the driver.

But what will happen in that case? The worker will have to send more messages, so is the only overhead that the job will be slower?

If I understand correctly, assuming a worker wants to send 4G of data to the driver, then setting spark.driver.maxResultSize=1G will cause the worker to send 4 messages (instead of 1 with an unlimited spark.driver.maxResultSize). If so, then increasing that attribute to protect my driver from being killed by YARN would be wrong.

But the question above still remains: if I set it to 1M (the minimum), would that be the most protective approach?

Answer

"assuming that a worker wants to send 4G of data to the driver, then having spark.driver.maxResultSize=1G, will cause the worker to send 4 messages (instead of 1 with unlimited spark.driver.maxResultSize)."

No. If the estimated size of the data is larger than maxResultSize, the given job will be aborted. The goal here is to protect your application from driver loss, nothing more.
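In other words, the limit is a hard cutoff, not a chunking threshold. A pure-Python sketch of the check the driver effectively performs (the helper names are hypothetical, not Spark's actual internals):

```python
# Illustrative sketch of the maxResultSize check: if the total serialized
# result size exceeds the configured limit, the job is aborted rather than
# split into smaller messages. Helper names here are hypothetical.

_UNITS = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30, "t": 1 << 40}

def parse_size(value: str) -> int:
    """Parse a Spark-style size string like '1g' or '512m' into bytes."""
    value = value.strip().lower()
    if value and value[-1] in _UNITS:
        return int(value[:-1]) * _UNITS[value[-1]]
    return int(value)  # plain byte count

def check_result_size(estimated_bytes: int, max_result_size: str) -> None:
    """Raise (abort the job) if the estimated result exceeds the limit.
    A limit of 0 means unlimited, mirroring the documented behaviour."""
    limit = parse_size(max_result_size)
    if limit > 0 and estimated_bytes > limit:
        raise RuntimeError(
            f"Total size of serialized results ({estimated_bytes} bytes) "
            f"is bigger than spark.driver.maxResultSize ({limit} bytes)"
        )

# A 4 GiB result against a 1 GiB limit is aborted, not chunked:
try:
    check_result_size(4 * (1 << 30), "1g")
except RuntimeError as err:
    print("aborted:", err)
```

This is why raising the limit does not change how many messages are sent; it only changes the point at which the driver gives up on collecting the result.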

"if I set it to 1M (the minimum), will it be the most protective approach?"

In a sense, yes, but it is obviously not useful in practice. A good value should allow the application to proceed normally while still protecting it from unexpected conditions.

