Distribute a Flink operator evenly across taskmanagers
Problem description
I'm prototyping a Flink streaming application on a bare-metal cluster of 15 machines. I'm using yarn-mode with 90 task slots (15x6).
The app reads data from a single Kafka topic. The Kafka topic has 15 partitions, so I set the parallelism of the source operator to 15 as well. However, I found that Flink in some cases assigns 2-4 instances of the consumer task to the same taskmanager. This causes certain nodes to become network-bound (the Kafka topic is serving a high volume of data and the machines only have 1G NICs), creating a bottleneck in the entire data flow.
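For reference, the source setup described above might look like the following sketch, using the Flink 1.9-era FlinkKafkaConsumer API. The topic name, broker address, and group id are placeholders, not values from the question:

```java
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class KafkaSourceJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker1:9092"); // placeholder
        props.setProperty("group.id", "prototype-consumer");    // placeholder

        FlinkKafkaConsumer<String> source =
                new FlinkKafkaConsumer<>("my-topic", new SimpleStringSchema(), props);

        // One consumer subtask per Kafka partition: 15 partitions -> parallelism 15.
        // Which taskmanager each subtask lands on is decided by the scheduler.
        DataStream<String> stream = env.addSource(source).setParallelism(15);

        stream.print();
        env.execute("kafka-source-prototype");
    }
}
```

Note that setParallelism(15) fixes the number of source subtasks but says nothing about their placement, which is exactly the problem described above.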
Is there a way to "force" or otherwise instruct Flink to distribute a task evenly across all taskmanagers, perhaps round-robin? If not, is there a way to manually assign tasks to specific taskmanager slots?
Recommended answer
Flink does not allow manual assignment of tasks to slots, because in case of failure handling it needs to be able to redistribute tasks to the remaining task managers.
However, you can distribute the workload evenly by setting cluster.evenly-spread-out-slots: true in flink-conf.yaml. This works for Flink >= 1.9.2.
To make it work, you may also have to set taskmanager.numberOfTaskSlots equal to the number of available CPUs per machine, and parallelism.default equal to the total number of CPUs in the cluster.
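Putting the three settings together, the relevant flink-conf.yaml entries might look like this (a sketch assuming 6 CPUs per machine and 15 machines, matching the 90-slot setup in the question):

```yaml
# flink-conf.yaml (requires Flink >= 1.9.2 for the first option)

# Spread slot allocation evenly across all registered taskmanagers
# instead of filling up one taskmanager before using the next.
cluster.evenly-spread-out-slots: true

# One slot per available CPU on each machine (6 CPUs assumed here).
taskmanager.numberOfTaskSlots: 6

# Default parallelism = total CPUs in the cluster (15 machines x 6 = 90 assumed).
parallelism.default: 90
```

With these settings, a source with parallelism 15 on 15 taskmanagers should land one subtask per machine, since the scheduler prefers taskmanagers with the most free slots.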