Apache Hadoop Yarn - Underutilization of cores


Problem Description

No matter how much I tinker with the settings in yarn-site.xml, i.e. using all of the options below (a sample snippet follows the list),

yarn.scheduler.minimum-allocation-vcores
yarn.nodemanager.resource.memory-mb
yarn.nodemanager.resource.cpu-vcores
yarn.scheduler.maximum-allocation-mb
yarn.scheduler.maximum-allocation-vcores
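
(For context, these properties are set in yarn-site.xml on each node. The original post does not show the values that were tried, so the snippet below is only an illustrative sketch; the numbers are hypothetical, not taken from the question.)

<property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <!-- hypothetical value: cores this NodeManager offers to containers -->
    <value>8</value>
</property>
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <!-- hypothetical value: 24 GB of RAM offered to containers -->
    <value>24576</value>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <!-- hypothetical value: largest vcore request one container may make -->
    <value>8</value>
</property>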

I still cannot get my application, i.e. Spark, to utilize all the cores on the cluster. The Spark executors seem to be taking up all the available memory correctly, but each executor keeps taking only a single core and no more.

Here is the configuration from spark-defaults.conf:

spark.executor.cores                    3
spark.executor.memory                   5100m
spark.yarn.executor.memoryOverhead      800
spark.driver.memory                     2g
spark.yarn.driver.memoryOverhead        400
spark.executor.instances                28
spark.reducer.maxMbInFlight             120
spark.shuffle.file.buffer.kb            200
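
(As an aside, the same settings can also be passed per job on the spark-submit command line rather than via spark-defaults.conf. A minimal sketch; the application jar name is a placeholder:)

spark-submit \
    --master yarn \
    --driver-memory 2g \
    --executor-memory 5100m \
    --executor-cores 3 \
    --num-executors 28 \
    --conf spark.yarn.executor.memoryOverhead=800 \
    your-app.jar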

Notice that spark.executor.cores is set to 3, but it doesn't work. How do I fix this?

Recommended Answer

The problem lies not with yarn-site.xml or spark-defaults.conf but with the resource calculator that assigns cores to the executors, or, in the case of MapReduce jobs, to the mappers/reducers.

The default resource calculator, i.e. org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, uses only memory information for allocating containers; CPU scheduling is not enabled by default. To use both memory and CPU, the resource calculator needs to be changed to org.apache.hadoop.yarn.util.resource.DominantResourceCalculator in the capacity-scheduler.xml file.

Here is what needs to change:

<property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
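
After editing capacity-scheduler.xml, the ResourceManager has to pick up the change. A minimal sketch, assuming a standard Hadoop installation (on some versions a full ResourceManager restart is required instead of a queue refresh):

# ask the ResourceManager to reload capacity-scheduler.xml
yarn rmadmin -refreshQueues

Once the DominantResourceCalculator is active, the allocated-vcores figures in the ResourceManager web UI should match the executor cores requested by Spark instead of showing one vcore per container.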

