为什么我的Flink窗口使用了这么多状态? [英] Why are my Flink windows using so much state?

查看:149
本文介绍了为什么我的Flink窗口使用了这么多状态?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的Flink作业的检查点越来越大.深入研究各个任务之后,键控窗口函数似乎是大部分大小的原因.我该如何减少呢?

The checkpoints for my Flink job are getting larger and larger. After drilling down into individual tasks, the keyed window function seems to be responsible for most of the size. How can I reduce this?

推荐答案

如果您在Windows中捆绑了很多状态,则有几种可能性:

If you have a lot of state tied up in windows, there are several possibilities:

  • 使用增量聚合(通过使用 reduce aggregate )可以大大减少您的存储需求.否则,每个事件都将被复制到分配给每个窗口的事件列表中.

  • Using incremental aggregation (by using reduce or aggregate) can dramatically reduce your storage requirements. Otherwise each event is being copied into the list of events assigned to each window.

如果要汇总多个时间范围(例如,每分钟和每10分钟),则可以层叠这些窗口,以便10分钟的窗口仅占用数分钟的窗口的输出,而不是每分钟事件.

If you are aggregating over multiple timeframes, e.g., every minute and every 10 minutes, you can cascade these windows, so that the 10 minute windows are only consuming the output of the minute-long windows, rather than every event.

如果使用的是滑动窗口,则会将每个事件分配给每个重叠的窗口.例如,如果您的窗口长2分钟并且滑动1秒,则每个事件都将复制到120个窗口中.增量和/或预聚合将在这里有所帮助(很多!),或者您可能想使用 KeyedProcessFunction 而不是窗口来优化状态足迹.

If you are using sliding windows, each event is being assigned to each of the overlapping windows. For example, if your windows are 2 minutes long and sliding by 1 second, each event is being copied into 120 windows. Incremental and/or pre-aggregation will help here (a lot!), or you may want to use a KeyedProcessFunction instead of a window in order to optimize your state footprint.

如果您有键计数窗口,则可能有一些键,这些键从未达到(或非常缓慢)所需的批处理大小,从而导致越来越多的部分批处理处于状态.您可以实现一个自定义的 Trigger ,该触发器除了基于计数的触发外还包含超时功能,以便最终处理这些部分批次.

If you have keyed count windows, you could have keys for which the requisite batch size is never (or only very slowly) reached, leading to more and more partial batches sitting around in state. You could implement a custom Trigger that incorporates a timeout in addition to the count-based triggering so that these partial batches are eventually processed.

如果在 ProcessWindowFunction 中使用 globalState ,则陈旧密钥的globalState将累积.您可以在globalState的状态描述符上使用状态TTL.注意:这是唯一在清除窗口后不会自动释放窗口状态的地方.

If you are using globalState in a ProcessWindowFunction, the globalState for stale keys will accumulate. You can use state TTL on the state descriptor for the globalState. Note: this is the only place where window state isn't automatically freed when windows are cleared.

或者可能只是您的关键空间随着时间的推移而增长,除了扩大集群规模之外,实际上没有其他可做的事情.

Or it may simply be that your key space is growing over time, and there's really nothing that can be done except to scale up the cluster.

这篇关于为什么我的Flink窗口使用了这么多状态?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆