Is one TaskManager with three slots the same as three TaskManagers with one slot in Apache Flink?


Question

As I understand it, in Flink the JobManager can spread a job across multiple TaskManagers, each with multiple slots, if necessary. For example, one job might use five slots spread over three TaskManagers.

Now suppose I run one TaskManager (TM) with three slots, and it is given 3 GB of RAM and one CPU.

Is this exactly the same as running three TaskManagers that share one CPU, each of them given 1 GB of RAM?

case 1
---------------
| 3G RAM      |
| one CPU     |
| three slots |
| TM          |
---------------

case 2
--------------------------------------------|
|              one CPU                      |
|  ------------  ------------ ------------  |
|  | 1G RAM   |  | 1G RAM   | | 1G RAM   |  |
|  | one slot |  | one slot | | one slot |  |
|  | TM       |  | TM       | | TM       |  |
|  ------------  ------------ ------------  |
--------------------------------------------|

Recommended answer

There are performance and operational differences, and they pull in both directions.

When running in non-containerized environments with the RocksDB state backend, it can make sense to have a single TM per machine with many slots, because this minimizes the per-TM overhead. That said, the per-TM overhead is not that significant.

On the other hand, running with one slot per TM provides some helpful isolation and reduces the impact of garbage collection, which is particularly relevant with a heap-based state backend.

With containerized deployments, it is generally recommended to go with one slot per TM until you reach some significant scale, at which point you will want to scale by adding more slots per TM rather than more TMs. The issue is that the checkpoint coordinator needs to coordinate with each TM (but not with each slot), and as the number of TMs gets into the hundreds or thousands, this can become a bottleneck.
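
To make the scaling argument concrete, here is a rough sketch of how the number of TMs (and therefore the number of parties the checkpoint coordinator has to talk to) changes with the slots-per-TM setting. It assumes the job needs as many slots as its parallelism, which holds with default slot sharing; the function name and the example parallelism of 2000 are purely illustrative and not taken from Flink.

```python
def task_managers_needed(parallelism: int, slots_per_tm: int) -> int:
    """Return how many TMs are needed to supply `parallelism` slots.

    Assumption (not from the answer): the job needs exactly `parallelism`
    slots, as is the case with Flink's default slot sharing.
    """
    if parallelism < 1 or slots_per_tm < 1:
        raise ValueError("parallelism and slots_per_tm must be positive")
    # Integer ceiling division: a partially used TM still counts as one TM.
    return (parallelism + slots_per_tm - 1) // slots_per_tm


if __name__ == "__main__":
    parallelism = 2000  # hypothetical large job
    for slots_per_tm in (1, 4, 8):
        tms = task_managers_needed(parallelism, slots_per_tm)
        print(f"{slots_per_tm} slot(s) per TM -> {tms} TMs to coordinate")
```

For the small setup in the question the difference is only one TM versus three, so the coordination cost is unlikely to matter there; it is at hundreds or thousands of TMs that packing more slots into each TM starts to pay off.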
