Is global state with multiple workers possible in Flink?

Question

Everywhere in the Flink docs I see that state is local to a map function and a worker. This seems powerful in a standalone setup, but what if Flink runs in a cluster? Can Flink handle a global state that all workers can add data to and query?

From the Flink article on state:

For high throughput and low latency in this setting, network communications among tasks must be minimized. In Flink, network communication for stream processing only happens along the logical edges in the job’s operator graph (vertically), so that the stream data can be transferred from upstream to downstream operators.

However, there is no communication between the parallel instances of an operator (horizontally). To avoid such network communication, data locality is a key principle in Flink and strongly affects how state is stored and accessed.
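To make the locality idea concrete, here is a minimal plain-Java sketch of how keyed state can stay local to one parallel instance. It is not Flink code: the class `KeyedLocalStateSketch`, its methods, and the "hash modulo parallelism" routing rule are all assumptions made for illustration (Flink's real key assignment goes through key groups). The point is only that every key has exactly one owner, so no instance ever needs to read another instance's state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; none of these names are Flink APIs.
class KeyedLocalStateSketch {
    // One private map per simulated parallel instance.
    private final List<Map<String, Long>> localState = new ArrayList<>();

    KeyedLocalStateSketch(int parallelism) {
        for (int i = 0; i < parallelism; i++) {
            localState.add(new HashMap<>());
        }
    }

    // Assumed routing rule: hash of the key modulo the parallelism.
    int instanceFor(String key) {
        return Math.floorMod(key.hashCode(), localState.size());
    }

    // Only the owning instance's map is updated; no cross-instance access.
    void process(String key) {
        localState.get(instanceFor(key)).merge(key, 1L, Long::sum);
    }

    // Number of instances holding state for this key (0 or 1 by construction).
    int ownersOf(String key) {
        int owners = 0;
        for (Map<String, Long> state : localState) {
            if (state.containsKey(key)) {
                owners++;
            }
        }
        return owners;
    }

    // Count stored for the key on its owning instance.
    long countOf(String key) {
        return localState.get(instanceFor(key)).getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        KeyedLocalStateSketch job = new KeyedLocalStateSketch(4);
        for (String key : new String[] {"a", "b", "a", "c", "a"}) {
            job.process(key);
        }
        System.out.println("count(a) = " + job.countOf("a"));
        System.out.println("owners(a) = " + job.ownersOf("a"));
    }
}
```

Because the routing is deterministic, all events for a given key land on the same instance, which is what lets each instance keep its state local.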

Recommended answer

I think Flink supports only operator state and keyed state (state on keyed streams). If you need some kind of global state, you have to store it in an external database, file system, or shared memory, read it back from there, and combine it with your stream.

Anyway, in my experience, with a good processing pipeline design and the right data partitioning, in most cases you should be able to apply divide-and-conquer algorithms or MapReduce strategies to achieve what you need.
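As a rough illustration of the divide-and-conquer idea, here is a plain-Java sketch (not Flink code) that computes a global mean without any shared state: each partition builds a small partial result, and only those partials are merged. The names `PartialAggregateSketch`, `SumCount`, `aggregateLocally`, and `mergeAll`, and the round-robin partitioning, are hypothetical and chosen just for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; none of these names are Flink APIs.
class PartialAggregateSketch {
    // Partial result of one partition: enough to merge, not the raw data.
    static final class SumCount {
        final long sum;
        final long count;

        SumCount(long sum, long count) {
            this.sum = sum;
            this.count = count;
        }

        // Merging is associative, so partials can be combined in any order.
        SumCount merge(SumCount other) {
            return new SumCount(sum + other.sum, count + other.count);
        }

        double mean() {
            return count == 0 ? 0.0 : (double) sum / count;
        }
    }

    // Assumed partitioning: round-robin over the simulated workers.
    static List<List<Integer>> partition(List<Integer> values, int parallelism) {
        List<List<Integer>> parts = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < values.size(); i++) {
            parts.get(i % parallelism).add(values.get(i));
        }
        return parts;
    }

    // "Map" step: each worker sees only its own partition.
    static SumCount aggregateLocally(List<Integer> part) {
        long sum = 0;
        for (int v : part) {
            sum += v;
        }
        return new SumCount(sum, part.size());
    }

    // "Reduce" step: combine the partial results into the global one.
    static SumCount mergeAll(List<SumCount> partials) {
        SumCount total = new SumCount(0, 0);
        for (SumCount p : partials) {
            total = total.merge(p);
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        List<SumCount> partials = new ArrayList<>();
        for (List<Integer> part : partition(values, 3)) {
            partials.add(aggregateLocally(part));
        }
        System.out.println("global mean = " + mergeAll(partials).mean());
    }
}
```

The workers never look at each other's data. Only the small `SumCount` values cross partition boundaries, which is the same shape as a keyed aggregation followed by a final merge step.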

If you introduce some kind of global state into your system, that global state can become a major bottleneck. So try to avoid it at all costs.
