Why is the Flink container vcore size always 1?


Problem description

I am running Flink on YARN (more precisely, in an AWS EMR YARN cluster).

I read in the Flink documentation and source code that, by default, when requesting resources from YARN, Flink requests the number of slots per TaskManager as the number of vcores for each TaskManager container. I also confirmed this in the source code:

// Resource requirements for worker containers
int taskManagerSlots = taskManagerParameters.numSlots();
int vcores = config.getInteger(ConfigConstants.YARN_VCORES, Math.max(taskManagerSlots, 1));
Resource capability = Resource.newInstance(containerMemorySizeMB, vcores);

resourceManagerClient.addContainerRequest(
    new AMRMClient.ContainerRequest(capability, null, null, priority));
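For context, the ConfigConstants.YARN_VCORES constant in the snippet above corresponds to the yarn.containers.vcores option, so the vcore count can also be set explicitly instead of being derived from the slot count. A minimal flink-conf.yaml sketch (the values here are illustrative, matching a 3-slot setup):

```yaml
# flink-conf.yaml
# Explicitly request 3 vcores per TaskManager container from YARN;
# if unset, Flink falls back to the number of task slots (min 1).
yarn.containers.vcores: 3
taskmanager.numberOfTaskSlots: 3
```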

When I start Flink with -yn 1 -ys 3, I assume YARN will allocate 3 vcores for the single TaskManager container, but when I check the number of vcores for each container in the YARN ResourceManager web UI, I always see 1 vcore. The ResourceManager logs also show the vcore count as 1.

I debugged the Flink source code down to the lines pasted above, and I saw that the value of vcores was 3. This really confuses me; can anyone help clarify? Thanks.

Recommended answer

An answer from Kien Truong:

You have to enable CPU scheduling in YARN; otherwise, it always shows that only 1 CPU is allocated for each container, regardless of how many vcores Flink tries to allocate. To enable it, add (or edit) the following property in capacity-scheduler.xml:

<property>
 <name>yarn.scheduler.capacity.resource-calculator</name>
 <!-- <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value> -->
 <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>

The TaskManager memory is, for example, 1400 MB, but Flink reserves some of it for off-heap memory, so the actual heap size is smaller.

This is controlled by two settings:

containerized.heap-cutoff-min: default 600 MB

containerized.heap-cutoff-ratio: default 15% of the TaskManager's memory

That's why your TaskManager's heap size is limited to ~800 MB (1400 - 600).
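The cutoff arithmetic can be sketched as follows. This is a simplified illustration of how the two containerized.heap-cutoff-* settings interact, not Flink's actual implementation; the method name is hypothetical:

```java
// Sketch of the heap cutoff calculation described above: the cutoff is
// the larger of the fixed minimum and the ratio-based share, and the
// remainder of the container memory is available as heap.
public class HeapCutoff {
    public static int heapSizeMB(int tmMemoryMB, int cutoffMinMB, double cutoffRatio) {
        int cutoff = Math.max(cutoffMinMB, (int) (tmMemoryMB * cutoffRatio));
        return tmMemoryMB - cutoff;
    }

    public static void main(String[] args) {
        // Defaults from the answer: 600 MB minimum, 15% ratio.
        // 1400 * 0.15 = 210 MB < 600 MB, so the 600 MB minimum wins.
        System.out.println(heapSizeMB(1400, 600, 0.15)); // prints 800
    }
}
```

With a larger container, the ratio dominates instead: at 8000 MB and a 25% ratio, the cutoff is 2000 MB, leaving 6000 MB of heap.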

Regards,

Kien

