Flink job distribution over cluster nodes


Problem description

We have 4 jobs running on 3 nodes, with 4 slots per node.

On Flink 1.3.2 the jobs were evenly distributed across the nodes. After upgrading to Flink 1.5, each job runs on a single node (spilling over to another node only when no slots are left).

Is there a way to return to an even distribution? The jobs do not generate equal load, so with the new placement some nodes work much harder than others.

Recommended answer

An answer I received from the Flink mailing list (Re: Flink 1.5 job distribution over cluster nodes):

Shachar,

With Flink 1.5 we added resource elasticity. This means that Flink is now able to allocate new containers on a cluster management framework like Yarn or Mesos. Due to these changes (which also apply to the standalone mode), Flink no longer reasons about a fixed set of TaskManagers, because if needed it will start new containers (this does not work in standalone mode). Therefore, it is hard for the system to make any decisions about spreading slots belonging to a single job out across multiple TMs. It gets even harder when you consider that some jobs, like yours, might benefit from such a strategy, whereas others would benefit from co-locating their slots. It gets even more complicated if you want to do scheduling with respect to multiple jobs, which the system does not have full knowledge about because they are submitted sequentially. Therefore, Flink currently assumes that slot requests can be fulfilled by any TaskManager.

Cheers,
Till
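To make the two placement behaviours described above concrete, here is a toy simulation. It is only a sketch and not Flink's scheduler: the function names `fill_first`, `spread_out`, and `schedule` are hypothetical, and the parallelism of 3 per job is an assumption (the question only gives 4 jobs, 3 nodes, and 4 slots per node). Jobs are placed one after another, mirroring the sequential submission mentioned in the reply.

```python
from collections import Counter


def fill_first(free, parallelism):
    """Toy model: take every slot from the first node that still has one free."""
    placement = Counter()
    for _ in range(parallelism):
        node = next((n for n, f in enumerate(free) if f > 0), None)
        if node is None:
            raise RuntimeError("no free slots left")
        free[node] -= 1
        placement[node] += 1
    return dict(placement)


def spread_out(free, parallelism):
    """Toy model: take each slot from the node with the most free slots."""
    placement = Counter()
    for _ in range(parallelism):
        node = max(range(len(free)), key=lambda n: free[n])
        if free[node] == 0:
            raise RuntimeError("no free slots left")
        free[node] -= 1
        placement[node] += 1
    return dict(placement)


def schedule(strategy, nodes=3, slots_per_node=4, jobs=4, parallelism=3):
    """Place jobs one by one; return a {node: slots_used} dict per job."""
    free = [slots_per_node] * nodes
    return [strategy(free, parallelism) for _ in range(jobs)]


if __name__ == "__main__":
    for name, strategy in (("fill-first", fill_first), ("spread-out", spread_out)):
        print(name)
        for job, placement in enumerate(schedule(strategy)):
            print(f"  job {job}: {placement}")
```

Under these assumptions, the fill-first strategy packs each job onto one node and spills over only when that node is full, which resembles the behaviour reported after the upgrade. The spread-out strategy gives every job one slot on each node, which resembles the distribution seen on 1.3.2.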
