Set different executor memory limits for different worker nodes


Question


I'm using Spark 1.5.2 in standalone deployment mode and launching using scripts. The executor memory is set via `spark.executor.memory` in conf/spark-defaults.conf. This sets the same memory limit for all the worker nodes. I would like to make it such that one can set a different limit for different nodes. How can I do that?


(Spark 1.5.2, Ubuntu 14.04)

Thanks

Answer


I don't believe there is any way to have heterogeneously sized executors. You can limit the sizes of the worker nodes, but this essentially limits the total amount of memory that the worker can allocate on that box.


When you're running your application, you can only specify `spark.executor.memory` at the application level, and it requests that much memory per executor on each worker.
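For illustration, the application-level setting might look like this (the `4g` value, master URL, and jar name are placeholders, not from the original question):

```shell
# conf/spark-defaults.conf — one value, applied to every executor on every worker
spark.executor.memory   4g

# or equivalently, per application at submit time:
spark-submit \
  --master spark://master:7077 \
  --conf spark.executor.memory=4g \
  my-app.jar
```

Either way, this is a single cluster-wide request size; standalone mode offers no per-node variant of this property.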


If you're in a situation where you have heterogeneously sized boxes, what you can do is set SPARK_WORKER_MEMORY to the smaller amount, and then set SPARK_WORKER_INSTANCES = 2 on the larger boxes and SPARK_WORKER_INSTANCES = 1 on the smaller boxes. In this example we assume your larger boxes have twice the memory of the smaller boxes. Then, by running twice as many executors on the larger boxes, you'll end up using all of the memory on both the larger and the smaller boxes.
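A minimal sketch of that workaround, assuming 16 GB boxes and 8 GB boxes (the sizes are illustrative; these variables go in conf/spark-env.sh on each node before starting the worker daemons):

```shell
# conf/spark-env.sh on the LARGER boxes (assumed ~16 GB available to Spark)
SPARK_WORKER_MEMORY=8g      # memory cap per worker instance
SPARK_WORKER_INSTANCES=2    # two workers on this box -> 2 x 8g

# conf/spark-env.sh on the SMALLER boxes (assumed ~8 GB available to Spark)
SPARK_WORKER_MEMORY=8g      # same per-worker cap as the large boxes
SPARK_WORKER_INSTANCES=1    # one worker on this box -> 1 x 8g
```

With `spark.executor.memory` set close to `SPARK_WORKER_MEMORY`, each worker instance hosts one full-size executor, so the large boxes run two executors and the small boxes run one. When `SPARK_WORKER_INSTANCES` is greater than 1, you will likely also want to set `SPARK_WORKER_CORES` so the instances on a box don't each try to claim all of its cores.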

