Can I have multiple subtasks of an operator in the same slot, in Flink?

Question

I have been exploring Apache Flink for a few days, and I have some doubts about the concept of a Task Slot. Several questions have already been asked about it, but there is one point I still don't get.

I am using a toy application for testing, running on a local cluster. I have disabled operator chaining.

I know from the docs that slots provide memory isolation but not CPU isolation. Reading the docs, it seems that a Task Slot is a Java thread.

1) When I deploy my application with parallelism=1, all the operators' subtasks are deployed in the same slot. However, if I print the current thread ID from the open() method of AbstractStreamOperator, I see different IDs for different subtasks. So aren't they sharing the same thread (i.e., the slot)?

2) If I change the parallelism from 1 to 3, I need 3 slots for the application to be re-deployed correctly. The documentation confirms that the number of slots limits the parallelism I can have. But why can I have subtasks of different operators in the same slot, while I cannot have subtasks of the same operator in the same slot?

Thank you for the explanation!

Recommended answer

The idea of slots is to slice the available resources into smaller parts. The available managed memory is evenly distributed among all slots. CPU cycles and JVM heap memory are not properly isolated with respect to slots.

In each slot you can deploy one or more Tasks. A Flink Task is executed by a dedicated thread. Thus, a slot is not itself a thread: if multiple Tasks are deployed to the same slot, multiple threads run in that slot.
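The thread-per-Task behaviour can be mimicked in plain Java. This is only a toy sketch and does not use Flink: the "slot" is nothing more than a label in the thread name, and `SlotThreadsSketch` and `runTasksInOneSlot` are hypothetical names. It shows why printing the thread ID from each subtask gives different values even though all subtasks sit in one slot.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CountDownLatch;

// Toy model (assumption, not Flink code): one "slot", one dedicated thread per task.
class SlotThreadsSketch {

    // Starts one thread per task name and returns the distinct thread IDs observed.
    static Set<Long> runTasksInOneSlot(List<String> taskNames) {
        Set<Long> threadIds = Collections.synchronizedSet(new LinkedHashSet<>());
        CountDownLatch allStarted = new CountDownLatch(taskNames.size());
        List<Thread> threads = new ArrayList<>();
        for (String name : taskNames) {
            Thread t = new Thread(() -> {
                threadIds.add(Thread.currentThread().getId());
                allStarted.countDown();
                try {
                    // Keep every task thread alive until all of them have started.
                    allStarted.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "slot-0/" + name);
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return threadIds;
    }

    public static void main(String[] args) {
        Set<Long> ids = runTasksInOneSlot(List.of("Source", "Map", "Sink"));
        System.out.println("distinct thread IDs in one slot: " + ids.size());
    }
}
```

With three task names, the returned set holds three different IDs, which matches what the question observed from open().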

A Task represents a parallel instance of a single Flink operator, or of multiple operators if they are chainable. Chaining is not always possible or desired, but when it is applied it fuses operators so that they are executed by the same Task thread. This is usually more efficient, since there are fewer context switches and records are not handed over to a different thread.
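The difference between chained and unchained execution can be sketched in plain Java. This is a simplified illustration and not how Flink implements it: the two operators (`toUpper`, `exclaim`) and the class `ChainingSketch` are hypothetical, and a bounded queue stands in for the handover between Task threads.

```java
import java.util.Locale;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy model (assumption, not Flink code) of chained vs. unchained operators.
class ChainingSketch {

    // Two hypothetical operators.
    static String toUpper(String record) { return record.toUpperCase(Locale.ROOT); }
    static String exclaim(String record) { return record + "!"; }

    // Chained: both operators run back to back on the caller's thread.
    static String runChained(String record) {
        return exclaim(toUpper(record));
    }

    // Unchained: the first operator runs on its own thread and hands the
    // record to the second operator through a queue.
    static String runUnchained(String record) {
        BlockingQueue<String> handover = new ArrayBlockingQueue<>(1);
        Thread upstream = new Thread(() -> {
            try {
                handover.put(toUpper(record));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "upstream-task");
        upstream.start();
        try {
            String result = exclaim(handover.take());
            upstream.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runChained("task"));
        System.out.println(runUnchained("task"));
    }
}
```

Both variants compute the same result; the unchained one pays for an extra thread and a queue handover, which is the overhead chaining avoids.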

To improve resource utilization (especially for Tasks that need few resources) and to make it easier to reason about how many slots a Flink program needs, Flink supports slot sharing. Slot sharing means that parallel instances of different operators can be deployed to the same slot. Flink therefore builds pipelines of different operators that are as long as possible and deploys each pipeline to a single slot. This has the additional benefit of increasing the co-location of producers with their respective consumers. It also means that users only need to provide as many slots as the maximum parallelism among all operators of their topology.
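The slot-count rule stated above reduces to a small calculation. The sketch below assumes that all operators can share slots with each other, as the answer describes; `SlotCountSketch`, `requiredSlots`, and the operator names are hypothetical and not part of any Flink API.

```java
import java.util.Map;

// Toy calculation (assumption: every operator may share slots with every other one).
class SlotCountSketch {

    // With slot sharing, the job needs as many slots as its most parallel operator.
    static int requiredSlots(Map<String, Integer> parallelismByOperator) {
        return parallelismByOperator.values().stream()
                .mapToInt(Integer::intValue)
                .max()
                .orElse(0);
    }

    public static void main(String[] args) {
        Map<String, Integer> job = Map.of("source", 1, "map", 3, "sink", 2);
        System.out.println("slots needed: " + requiredSlots(job));
    }
}
```

For the job in `main`, the most parallel operator has parallelism 3, so three slots are needed, not the six that the sum of all subtasks would suggest.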

However, since you still want to distribute the parallel instances of an operator across all available TaskExecutors, Flink does not support deploying parallel instances of the same operator to the same slot. If you want them all in one slot, simply reduce the parallelism of the respective operator to 1.
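One way to picture this constraint is a placement in which subtask i of every operator goes to slot i. This is a deliberately simplified model and not Flink's actual scheduler; `SlotAssignmentSketch`, `assign`, and the operator names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy placement (assumption, not Flink's scheduler): subtask i of each operator goes to slot i.
class SlotAssignmentSketch {

    // Returns the contents of each slot; no slot ever holds two subtasks of the same operator.
    static List<List<String>> assign(Map<String, Integer> parallelismByOperator) {
        List<List<String>> slots = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : parallelismByOperator.entrySet()) {
            for (int i = 0; i < entry.getValue(); i++) {
                if (slots.size() <= i) {
                    slots.add(new ArrayList<>()); // open a new slot when needed
                }
                slots.get(i).add(entry.getKey() + "[" + i + "]");
            }
        }
        return slots;
    }

    public static void main(String[] args) {
        Map<String, Integer> job = new LinkedHashMap<>();
        job.put("source", 1);
        job.put("map", 3);
        job.put("sink", 3);
        List<List<String>> slots = assign(job);
        for (int i = 0; i < slots.size(); i++) {
            System.out.println("slot " + i + ": " + slots.get(i));
        }
    }
}
```

In this model, raising an operator's parallelism to 3 forces a third slot into existence, while subtasks of different operators still share each slot.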
