带有 onEventTime 触发器的 Flink 会话窗口? [英] Flink session window with onEventTime trigger?

查看:26
本文介绍了带有 onEventTime 触发器的 Flink 会话窗口?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想在 Flink 中创建一个基于 EventTime 的会话窗口,以便在新消息的事件时间比创建该窗口的消息的事件时间大 180 秒时触发.

I want to create an EventTime based session-window in Flink, such that it triggers when the event time of a new message is more than 180 seconds greater than the event time of the message, that created the window.

例如:

t1(0 seconds)  : msg1  <-- This is the first message which causes the session-windows to be created
t2(13 seconds) : msg2
t3(39 seconds) : msg3
.
.
.
.
t7(190 seconds) : msg7 <-- The event time (t7) is more than 180 seconds than t1 (t7 - t1 = 190), so the window should be triggered and processed now.
t8(193 seconds) : msg8 <-- This message, and all subsequent messages have to be ignored as this window was processed at t7

我想创建一个触发器,以便通过适当的水印或 onEventTime 触发器实现上述行为.任何人都可以提供一些示例来实现这一目标吗?

I want to create a trigger such that the above behavior is achieved through appropriate watermark or onEventTime trigger. Can anyone please provide some examples to achieve this?

推荐答案

解决此问题的最佳方法可能是使用 ProcessFunction,而不是使用自定义窗口.如果如您的示例所示,事件将按时间戳顺序处理,那么这将非常简单.另一方面,如果您必须处理无序事件(这在处理事件时间数据时很常见),则会稍微复杂一些.(想象一下,时间为 187 的 msg6 在 t8 之后到达.如果可能,并且如果这会影响您想要产生的结果,那么必须处理.)

The best way to approach this might be with a ProcessFunction, rather than with custom windowing. If, as shown in your example, the events will be processed in timestamp order, then this will be pretty straightforward. If, on the other hand, you have to handle out-of-order events (which is common when working with event time data), it will be somewhat more complex. (Imagine that msg6 with for time 187 arrives after t8. If that's possible, and if that will affect the results you want to produce, then this has to be handled.)

如果事件是有序的,那么逻辑大概是这样的:

If the events are in order, then the logic would look roughly like this:

使用 AscendingTimestampExtractor 作为水印的基础.

Use an AscendingTimestampExtractor as the basis for watermarking.

使用 Flink 状态(可能是 ListState)来存储窗口内容.当事件到达时,将其添加到窗口并检查自第一个事件以来是否已经超过 180 秒.如果是,请处理窗口内容并清除列表.

Use Flink state (perhaps ListState) to store the window contents. When an event arrives, add it to the window and check to see if it has been more than 180 seconds since the first event. If so, process the window contents and clear the list.

如果您的事件可能是无序的,则使用 BoundedOutOfOrdernessTimestampExtractor,并且在 currentWatermark 指示事件时间已超过窗口开始时间 180 秒之前不要处理窗口的内容(您可以使用事件时间计时器为了这).触发窗口时不要完全清除列表,而只是删除属于正在关闭的窗口的元素.

If your events can be out-of-order, then use a BoundedOutOfOrdernessTimestampExtractor, and don't process the window's contents until currentWatermark indicates that event time has passed 180 seconds past the window's start time (you can use an event time timer for this). Don't completely clear the list when triggering a window, but just remove the elements that belong to the window that is closing.

这篇关于带有 onEventTime 触发器的 Flink 会话窗口?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆