使用onEventTime触发器触发Flink会话窗口? [英] Flink session window with onEventTime trigger?

查看:315
本文介绍了使用onEventTime触发器触发Flink会话窗口?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想在Flink中创建一个基于EventTime的会话窗口,以便在新消息的事件时间比创建窗口的消息的事件时间长180秒以上时触发.

I want to create an EventTime based session-window in Flink, such that it triggers when the event time of a new message is more than 180 seconds greater than the event time of the message, that created the window.

例如:

t1(0 seconds)  : msg1  <-- This is the first message which causes the session-windows to be created
t2(13 seconds) : msg2
t3(39 seconds) : msg3
.
.
.
.
t7(190 seconds) : msg7 <-- The event time (t7) is more than 180 seconds than t1 (t7 - t1 = 190), so the window should be triggered and processed now.
t8(193 seconds) : msg8 <-- This message, and all subsequent messages have to be ignored as this window was processed at t7

我想创建一个触发器,以便通过适当的水印或onEventTime触发器实现上述行为.任何人都可以提供一些示例来实现这一目标吗?

I want to create a trigger such that the above behavior is achieved through appropriate watermark or onEventTime trigger. Can anyone please provide some examples to achieve this?

推荐答案

解决此问题的最佳方法可能是使用ProcessFunction,而不是使用自定义窗口.如您的示例所示,如果事件将按照时间戳顺序进行处理,那么这将非常简单.另一方面,如果您必须处理乱序事件(在处理事件时间数据时很常见),那么它将有些复杂. (想象一下,时间为187的msg6在t8之后到达.如果有可能,并且如果这会影响您想要产生的结果,则必须进行处理.)

The best way to approach this might be with a ProcessFunction, rather than with custom windowing. If, as shown in your example, the events will be processed in timestamp order, then this will be pretty straightforward. If, on the other hand, you have to handle out-of-order events (which is common when working with event time data), it will be somewhat more complex. (Imagine that msg6 with for time 187 arrives after t8. If that's possible, and if that will affect the results you want to produce, then this has to be handled.)

如果事件按顺序进行,那么逻辑将大致如下所示:

If the events are in order, then the logic would look roughly like this:

使用AscendingTimestampExtractor作为加水印的基础.

Use an AscendingTimestampExtractor as the basis for watermarking.

使用Flink状态(也许是ListState)来存储窗口内容.当事件到达时,将其添加到窗口中,并检查自第一个事件以来是否已超过180秒.如果是这样,请处理窗口内容并清除列表.

Use Flink state (perhaps ListState) to store the window contents. When an event arrives, add it to the window and check to see if it has been more than 180 seconds since the first event. If so, process the window contents and clear the list.

如果事件可能是乱序的,则使用BoundedOutOfOrdernessTimestampExtractor,直到currentWatermark指示事件时间已超过窗口开始时间180秒后,才处理窗口的内容(可以使用事件时间计时器为了这).触发窗口时,不要完全清除列表,而只是删除属于要关闭的窗口的元素.

If your events can be out-of-order, then use a BoundedOutOfOrdernessTimestampExtractor, and don't process the window's contents until currentWatermark indicates that event time has passed 180 seconds past the window's start time (you can use an event time timer for this). Don't completely clear the list when triggering a window, but just remove the elements that belong to the window that is closing.

这篇关于使用onEventTime触发器触发Flink会话窗口?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆