Apache Spark: setting executor instances does not change the executors

Problem description

I have an Apache Spark application running on a YARN cluster (spark has 3 nodes on this cluster) on cluster mode.

When the application is running the Spark-UI shows that 2 executors (each running on a different node) and the driver are running on the third node. I want the application to use more executors so I tried adding the argument --num-executors to Spark-submit and set it to 6.

spark-submit --driver-memory 3G --num-executors 6 --class main.Application --executor-memory 11G --master yarn-cluster myJar.jar <arg2><arg3>...

However, the number of executors remains 2.

On spark UI I can see that the parameter spark.executor.instances is 6, just as I intended, and somehow there are still only 2 executors.

I even tried setting this parameter from the code

sparkConf.set("spark.executor.instances", "6")

Again, I can see that the parameter was set to 6, but still there are only 2 executors.

Does anyone know why I couldn't increase the number of my executors?

yarn.nodemanager.resource.memory-mb is 12g in yarn-site.xml
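For reference, that setting lives in yarn-site.xml on each NodeManager, and the property takes its value in MB (so 12g is 12288); a minimal fragment might look like:

```xml
<!-- yarn-site.xml: total memory YARN may allocate to containers on this node, in MB -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>12288</value>
</property>
```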

Answer

Increase yarn.nodemanager.resource.memory-mb in yarn-site.xml.

With 12g per node you can only launch driver(3g) and 2 executors(11g).

Node1 - driver 3g (+7% overhead)

Node2 - executor1 11g (+7% overhead)

Node3 - executor2 11g (+7% overhead)

Now you are requesting an 11g executor3, and no node has 11g of memory available.
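The arithmetic can be sketched as follows (a minimal sketch, assuming the Spark 1.x default overhead of max(384 MB, 7% of the requested memory); `container_mb` is a hypothetical helper, not a Spark API):

```python
# Sketch of the YARN memory arithmetic: each container is the requested
# heap plus the memoryOverhead YARN reserves on top of it.

def container_mb(requested_gb):
    """Requested heap plus overhead: max(384 MB, 7% of the heap)."""
    heap_mb = requested_gb * 1024
    return heap_mb + max(384, int(0.07 * heap_mb))

node_capacity_mb = 12 * 1024  # yarn.nodemanager.resource.memory-mb = 12g

driver = container_mb(3)      # --driver-memory 3G  -> 3456 MB
executor = container_mb(11)   # --executor-memory 11G -> 12052 MB

# One 11g executor fits on an empty 12g node...
print(executor <= node_capacity_mb)
# ...but a second container never fits beside an existing one,
# so with the driver and two executors placed, executor3 has nowhere to go.
print(driver + executor <= node_capacity_mb)
print(2 * executor <= node_capacity_mb)
```

This is why spark.executor.instances can report 6 while YARN only ever grants 2 executor containers: the remaining requests sit pending until a node has enough free memory.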

For the 7% overhead, see spark.yarn.executor.memoryOverhead and spark.yarn.driver.memoryOverhead in https://spark.apache.org/docs/1.2.0/running-on-yarn.html
