为什么我的 Flink 窗口使用这么多状态? [英] Why are my Flink windows using so much state?

查看:28
本文介绍了为什么我的 Flink 窗口使用这么多状态?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的 Flink 作业的检查点越来越大.在深入研究单个任务后,键控窗口函数似乎对大部分大小负责.我怎样才能减少这种情况?

The checkpoints for my Flink job are getting larger and larger. After drilling down into individual tasks, the keyed window function seems to be responsible for most of the size. How can I reduce this?

推荐答案

如果你在 windows 中绑定了很多状态,有几种可能:

If you have a lot of state tied up in windows, there are several possibilities:

  • 使用增量聚合(通过使用 reduceaggregate)可以显着降低您的存储需求.否则,每个事件都会被复制到分配给每个窗口的事件列表中.

  • Using incremental aggregation (by using reduce or aggregate) can dramatically reduce your storage requirements. Otherwise each event is being copied into the list of events assigned to each window.

如果您在多个时间范围内聚合,例如每分钟和每 10 分钟,您可以级联这些窗口,以便 10 分钟窗口仅消耗分钟窗口的输出,而不是每活动.

If you are aggregating over multiple timeframes, e.g., every minute and every 10 minutes, you can cascade these windows, so that the 10 minute windows are only consuming the output of the minute-long windows, rather than every event.

如果您使用滑动窗口,则每个事件都会分配给每个重叠窗口.例如,如果您的窗口时长 2 分钟并滑动 1 秒,则每个事件都会被复制到 120 个窗口中.增量和/或预聚合在这里会有所帮助(很多!),或者您可能想要使用 KeyedProcessFunction 而不是窗口来优化您的状态足迹.

If you are using sliding windows, each event is being assigned to each of the overlapping windows. For example, if your windows are 2 minutes long and sliding by 1 second, each event is being copied into 120 windows. Incremental and/or pre-aggregation will help here (a lot!), or you may want to use a KeyedProcessFunction instead of a window in order to optimize your state footprint.

如果您有键控计数窗口,您可能会拥有从未(或仅非常缓慢)达到所需批量大小的键,从而导致越来越多的部分批次处于状态.您可以实现一个自定义 Trigger,除了基于计数的触发之外,它还包含超时,以便最终处理这些部分批次.

If you have keyed count windows, you could have keys for which the requisite batch size is never (or only very slowly) reached, leading to more and more partial batches sitting around in state. You could implement a custom Trigger that incorporates a timeout in addition to the count-based triggering so that these partial batches are eventually processed.

如果您在 ProcessWindowFunction 中使用 globalState,则过时键的 globalState 将累积.您可以在 globalState 的状态描述符上使用状态 TTL.注意:这是唯一一个在清除窗口时不会自动释放窗口状态的地方.

If you are using globalState in a ProcessWindowFunction, the globalState for stale keys will accumulate. You can use state TTL on the state descriptor for the globalState. Note: this is the only place where window state isn't automatically freed when windows are cleared.

或者可能只是您的密钥空间随着时间的推移而增长,除了扩展集群之外,真的没有什么可以做的.

Or it may simply be that your key space is growing over time, and there's really nothing that can be done except to scale up the cluster.

这篇关于为什么我的 Flink 窗口使用这么多状态?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆