Spark executor cores not shown in yarn resource manager


Problem description

The YARN ResourceManager is not showing the total cores for the Spark application. For example, if we submit a Spark job with 300 executors and executor-cores set to 3, Spark should ideally have 900 cores, but the YARN ResourceManager shows only 300 cores.

So is this just a display error, or does YARN not see the remaining 600 cores?

Environment: HDP 2.2, Scheduler: CapacityScheduler, Spark: 1.4.1

Solution

Set

yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator

in capacity-scheduler.xml
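As a property entry in capacity-scheduler.xml, this looks like:

```xml
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
```

After changing it, refresh the scheduler configuration (or restart the ResourceManager) for the setting to take effect.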

YARN is running more containers than there are allocated cores because, by default, DefaultResourceCalculator is used, and it considers only memory:

public int computeAvailableContainers(Resource available, Resource required) {
    // Only consider memory
    return available.getMemory() / required.getMemory();
}

Use DominantResourceCalculator instead; it takes both CPU and memory into account.
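To see why the calculator matters, here is a minimal sketch (not the actual Hadoop implementation; the class and numbers are illustrative) comparing the container math of the two calculators for a node with 100 GB of memory and 20 vcores, and a container request of 1 GB and 3 vcores:

```java
public class CalculatorDemo {

    // DefaultResourceCalculator behavior: containers are limited by memory only
    static int defaultContainers(int availMemGb, int reqMemGb) {
        return availMemGb / reqMemGb;
    }

    // DominantResourceCalculator behavior: containers are limited by
    // whichever resource (memory or vcores) runs out first
    static int dominantContainers(int availMemGb, int availCores,
                                  int reqMemGb, int reqCores) {
        return Math.min(availMemGb / reqMemGb, availCores / reqCores);
    }

    public static void main(String[] args) {
        // Node: 100 GB memory, 20 vcores; request: 1 GB, 3 vcores
        System.out.println(defaultContainers(100, 1));         // 100 containers, vcores ignored
        System.out.println(dominantContainers(100, 20, 1, 3)); // 6 containers, capped by vcores
    }
}
```

With the default calculator, YARN would hand out 100 containers on this node even though only 20 vcores exist, which is exactly why the ResourceManager's core accounting looks wrong.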

You can read more about DominantResourceCalculator in the Hadoop documentation.
