Can I have multiple subtasks of an operator in the same slot, in Flink?

Question

I have been exploring Apache Flink for a few days, and I have some doubts about the concept of a Task Slot. Several questions have already been asked about it, but there is one point I still don't get.

I am testing with a toy application on a local cluster, with operator chaining disabled.

I know from the docs that slots provide memory isolation but not CPU isolation. Reading the docs, it seems that a Task Slot is a Java thread.

1) When I deploy my application with parallelism=1, all the operators' subtasks are deployed in the same slot. However, if I print the current thread ID from the open() method of AbstractStreamOperator, I see different IDs for different subtasks. So aren't they sharing the same thread (i.e., the slot)?

2) If I change the parallelism from 1 to 3, I need 3 slots for the application to be redeployed correctly. The documentation confirms that the number of slots limits the parallelism I can have. But why can I have subtasks of different operators in the same slot, while I cannot have subtasks of the same operator in the same slot?

Thanks for the explanation!

Recommended answer

The idea of slots is to slice the available resources into smaller parts. The available managed memory is evenly distributed among all slots. CPU cycles and JVM heap memory are not properly isolated with respect to slots.

In each slot you can deploy one or more Tasks. A Flink Task is executed by a dedicated thread, so a slot is not itself a thread: if several Tasks are deployed to a slot, several threads run in that slot.
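To make the "one thread per Task" point concrete, here is a minimal plain-Java sketch. It does not use Flink at all; the class `SlotThreadsSketch` and the method `runTasksInOneSlot` are hypothetical names, and the "slot" is only an imagined grouping of tasks. It simply shows why printing the thread from each subtask yields different values even though the subtasks share one slot.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: a "slot" here is just a group of tasks, not a Flink object.
class SlotThreadsSketch {

    /** Runs every task of the imagined slot on its own thread and returns the thread names seen. */
    static Set<String> runTasksInOneSlot(String... taskNames) {
        Set<String> observedThreads = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[taskNames.length];
        for (int i = 0; i < taskNames.length; i++) {
            // One dedicated thread per task, named after the task.
            threads[i] = new Thread(
                    () -> observedThreads.add(Thread.currentThread().getName()),
                    taskNames[i]);
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observedThreads;
    }

    public static void main(String[] args) {
        // Three unchained operators with parallelism 1: one slot, three threads.
        System.out.println(runTasksInOneSlot("Source (1/1)", "Map (1/1)", "Sink (1/1)"));
    }
}
```

With chaining disabled, as in the question, each operator is its own Task, which matches the different thread IDs observed in open().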

A Task represents a parallel instance of a single Flink operator, or of multiple operators if they are chainable. Chaining is not always possible or desired, but when it is applied it fuses operators so that they are executed by the same Task thread. This is usually more efficient because there are fewer context switches and records do not have to be handed over to a different thread.
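The difference between chained and unchained execution can be sketched in plain Java. This is an illustrative model under stated assumptions, not Flink's implementation: `ChainingSketch`, `runChained`, `runUnchained`, and the `END` marker are all hypothetical. The chained variant fuses two operators into direct calls on one thread, while the unchained variant runs the first operator on its own thread and hands records over through a queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

// Toy model only: operators are plain functions, not Flink operators.
class ChainingSketch {
    static final int END = Integer.MIN_VALUE; // hypothetical end-of-stream marker

    /** Chained: both operators are fused and run as direct calls on the caller's thread. */
    static List<Integer> runChained(List<Integer> input,
                                    Function<Integer, Integer> op1,
                                    Function<Integer, Integer> op2) {
        Function<Integer, Integer> fused = op1.andThen(op2);
        List<Integer> out = new ArrayList<>();
        for (int x : input) {
            out.add(fused.apply(x));
        }
        return out;
    }

    /** Unchained: op1 runs on its own thread and hands each record to op2 through a queue. */
    static List<Integer> runUnchained(List<Integer> input,
                                      Function<Integer, Integer> op1,
                                      Function<Integer, Integer> op2) {
        BlockingQueue<Integer> handover = new ArrayBlockingQueue<>(16);
        Thread upstream = new Thread(() -> {
            try {
                for (int x : input) {
                    handover.put(op1.apply(x));
                }
                handover.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "op1-task");
        upstream.start();

        List<Integer> out = new ArrayList<>();
        try {
            for (int v = handover.take(); v != END; v = handover.take()) {
                out.add(op2.apply(v));
            }
            upstream.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return out;
    }
}
```

Both variants produce the same output; the unchained one just pays for an extra thread and a queue handover per record, which is the overhead chaining avoids.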

To improve resource utilization (especially for Tasks that need few resources) and to make it easier to reason about how many slots a Flink program needs, Flink supports slot sharing. Slot sharing means that parallel instances of different operators can be deployed to the same slot. With this feature, Flink builds pipelines of different operators that are as long as possible and deploys each pipeline to a single slot. A nice side effect is increased co-location of producers with their respective consumers. It also means users only need to provide as many slots as the maximum parallelism of any operator in the topology.
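The slot arithmetic can be written down as a small worked example. This is a sketch under two assumptions: all operators are in the same slot sharing group, and "without sharing" means every parallel instance occupies its own slot. The class `SlotCountSketch` and its methods are hypothetical helpers, not part of Flink.

```java
import java.util.Map;

// Toy arithmetic only; Flink's scheduler is not involved.
class SlotCountSketch {

    /** With slot sharing (one shared group): slots needed = maximum operator parallelism. */
    static int slotsWithSharing(Map<String, Integer> parallelismByOperator) {
        return parallelismByOperator.values().stream()
                .mapToInt(Integer::intValue)
                .max()
                .orElse(0);
    }

    /** Without any sharing: every parallel instance needs its own slot, so sum them. */
    static int slotsWithoutSharing(Map<String, Integer> parallelismByOperator) {
        return parallelismByOperator.values().stream()
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        Map<String, Integer> job = Map.of("source", 1, "map", 3, "sink", 2);
        System.out.println(slotsWithSharing(job));    // 3
        System.out.println(slotsWithoutSharing(job)); // 6
    }
}
```

For the question's case, setting every operator to parallelism 3 gives a maximum of 3, which is why 3 slots are required.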

However, because you still want to distribute the parallel instances of an operator across all available TaskExecutors, Flink does not support deploying parallel instances of the same operator to the same slot. If you want that behavior, simply reduce the parallelism of the respective operator to 1.
