Apache Spark: setting executor instances does not change the executors

Question
I have an Apache Spark application running in cluster mode on a YARN cluster (Spark has 3 nodes on this cluster).
When the application is running, the Spark UI shows that 2 executors (each running on a different node) and the driver are running on the third node. I want the application to use more executors, so I tried adding the argument --num-executors to spark-submit and setting it to 6:
spark-submit --driver-memory 3G --num-executors 6 --class main.Application --executor-memory 11G --master yarn-cluster myJar.jar
However, the number of executors remains 2.
On the Spark UI I can see that the parameter spark.executor.instances is 6, just as I intended, yet somehow there are still only 2 executors.
I even tried setting this parameter from the code:
sparkConf.set("spark.executor.instances", "6")
Again, I can see that the parameter was set to 6, but there are still only 2 executors.
Does anyone know why I couldn't increase the number of executors?
yarn.nodemanager.resource.memory-mb is 12g in yarn-site.xml.
Answer

Increase yarn.nodemanager.resource.memory-mb in yarn-site.xml.
With 12g per node you can only launch the driver (3g) and 2 executors (11g each):
Node1 - driver 3g (+7% overhead)
Node2 - executor1 11g (+7% overhead)
Node3 - executor2 11g (+7% overhead)
Now you are requesting executor3 with 11g, and no node has 11g of memory available.
For the 7% overhead, see spark.yarn.executor.memoryOverhead and spark.yarn.driver.memoryOverhead in https://spark.apache.org/docs/1.2.0/running-on-yarn.html
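The container arithmetic above can be sketched in a few lines. This is a rough model only: it assumes the Spark 1.2 default overhead of max(384 MB, 7% of the heap), and ignores YARN's rounding of requests up to multiples of yarn.scheduler.minimum-allocation-mb, which can make real containers slightly larger.

```python
# Rough model of what YARN must reserve per container:
# JVM heap plus Spark's memoryOverhead (default max(384 MB, 7% of heap)).
NODE_MB = 12 * 1024  # yarn.nodemanager.resource.memory-mb = 12g per node


def container_mb(heap_gb):
    """Total memory YARN reserves for one container of the given heap size."""
    heap_mb = heap_gb * 1024
    return heap_mb + max(384, int(heap_mb * 0.07))


driver = container_mb(3)     # --driver-memory 3G
executor = container_mb(11)  # --executor-memory 11G

print(driver)                # 3456  -> fits comfortably on a 12288 MB node
print(executor)              # 12052 -> fits, but consumes almost a whole node
print(NODE_MB - executor)    # 236   -> not enough left for any other container

# So each node holds at most one 11g executor (or the driver), and a
# 3-node cluster tops out at 2 executors regardless of
# spark.executor.instances. Either give the NodeManagers more memory
# or request smaller executors.
```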