How to control how many executors to run in yarn-client mode?


Problem description

I have a Hadoop cluster of 5 nodes where Spark runs in yarn-client mode.

I use --num-executors for the number of executors. The maximum number of executors I am able to get is 20. Even if I specify more, I get only 20 executors.
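
(For reference, the exact submit command isn't shown above; --num-executors corresponds to the spark.executor.instances property, and the sketch below shows the same request made programmatically in Scala. All values are illustrative, not taken from the question.)

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Roughly equivalent to:
//   spark-submit --master yarn --deploy-mode client \
//     --num-executors 30 --executor-memory 2g --executor-cores 2 app.jar
val conf = new SparkConf()
  .setAppName("executor-count-demo")        // hypothetical app name
  .setMaster("yarn")                        // launching from a driver program runs in client mode
  .set("spark.executor.instances", "30")    // what --num-executors sets
  .set("spark.executor.memory", "2g")       // what --executor-memory sets
  .set("spark.executor.cores", "2")         // what --executor-cores sets
val sc = new SparkContext(conf)
```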

Is there any upper limit on the number of executors that can be allocated? Is that a configuration setting, or is the decision made based on the available resources?

Recommended answer

Apparently your 20 running executors consume all of the available memory. You can try decreasing executor memory with the spark.executor.memory parameter, which should leave more room for additional executors to spawn.
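
To see why exactly 20 executors might appear, here is a back-of-the-envelope sketch of the memory-based ceiling. The per-node figures are assumptions (the real budget comes from yarn.nodemanager.resource.memory-mb and your --executor-memory setting), chosen only to show how such a cap can arise:

```scala
// All figures are assumptions for illustration, not values from the question.
val nodes = 5                                  // cluster size from the question
val nodeMemoryMb = 32 * 1024                   // memory YARN offers per NodeManager (assumed)
val executorMemoryMb = 7 * 1024                // spark.executor.memory (assumed)
val overheadMb = math.max(384, executorMemoryMb / 10)  // off-heap overhead: roughly 10%, min 384 MB
val containerMb = executorMemoryMb + overheadMb

val executorsPerNode = nodeMemoryMb / containerMb      // integer division: containers that fit per node
val maxExecutors = nodes * executorsPerNode            // 5 * 4 = 20 with these assumed numbers
println(s"About $maxExecutors executors fit before YARN stops granting containers")
```

With smaller executors (or more memory handed to YARN per node), executorsPerNode grows and so does the cap.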

Also, are you sure you set the executor count correctly? You can verify your environment settings in the Spark UI by checking the spark.executor.instances value on the Environment tab.
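
If the UI is inconvenient, the same values can also be read at runtime; a minimal sketch, assuming an existing SparkContext named sc (e.g. in spark-shell):

```scala
// Programmatic counterpart of the Environment tab.
println(sc.getConf.get("spark.executor.instances", "not set"))
println(sc.getConf.get("spark.executor.memory", "not set"))
// Executors currently registered with the driver (this map also includes the driver itself):
println(sc.getExecutorMemoryStatus.size)
```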

Edit: As Mateusz Dymczyk pointed out in the comments, the number of executors may be limited not only by RAM but also by CPU cores. In both cases the limit comes from the resource manager.
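
The same arithmetic applies to cores; a sketch with assumed figures (the per-node vcore budget comes from yarn.nodemanager.resource.cpu-vcores):

```scala
// Figures are assumptions for illustration, not values from the question.
val nodes = 5                         // cluster size from the question
val vcoresPerNode = 8                 // yarn.nodemanager.resource.cpu-vcores (assumed)
val executorCores = 2                 // spark.executor.cores / --executor-cores (assumed)
val coreBoundExecutors = nodes * (vcoresPerNode / executorCores)
println(s"At most $coreBoundExecutors executors before vcores run out")
// Whichever ceiling (memory-bound or core-bound) is smaller is the one that bites.
```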
