Apache Beam windowing: consider late data but emit only one pane
Question
I would like to emit a single pane when the watermark reaches x minutes past the end of the window. This lets me ensure I handle some late data, but still emit only one pane. I am currently working in Java.
At the moment I can't find a proper solution to this problem. I could emit a single pane when the watermark reaches the end of the window, but then any late data is dropped. I could emit the pane at the end of the window and then again when I receive late data, but in that case I am no longer emitting a single pane.
I currently have code similar to this:
    .triggering(
        // This is going to emit the pane, but I don't want to emit the pane yet!
        AfterWatermark.pastEndOfWindow()
        // This is going to emit panes each time I receive late data, however
        // I would like to only emit one pane at the end of the allowedLateness
    ).withAllowedLateness(allowedLateness).accumulatingFiredPanes())
In case there is still confusion: I would like to emit only a single pane, when the watermark passes the end of the window plus allowedLateness.
Recommended answer
Thanks Guillem. In the end I used your answer to find this very useful link with lots of Apache Beam examples. From this I came up with the following solution:
    // We first specify to never emit any panes
    .triggering(Never.ever())
    // We then specify to fire always when closing the window. This will emit a
    // single final pane at the end of allowedLateness
    .withAllowedLateness(allowedLateness, Window.ClosingBehavior.FIRE_ALWAYS)
    .discardingFiredPanes())
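To make the intended behaviour concrete, here is a toy model in plain Java of what the configuration above is meant to achieve for one window. It does not use Beam at all: the class `SinglePaneWindow`, its methods, and the millisecond values are hypothetical names invented for this sketch, and the `>=` comparison is a simplification of how a runner decides that a window has expired. Elements are buffered with no intermediate firing, and one pane is produced when the watermark reaches the window end plus the allowed lateness; anything arriving after that is dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Toy model of a single event-time window. This is NOT Beam code. */
final class SinglePaneWindow {
    private final long windowEndMillis;
    private final long allowedLatenessMillis;
    private final List<String> buffered = new ArrayList<>();
    private long watermarkMillis = Long.MIN_VALUE;
    private boolean closed = false;

    SinglePaneWindow(long windowEndMillis, long allowedLatenessMillis) {
        this.windowEndMillis = windowEndMillis;
        this.allowedLatenessMillis = allowedLatenessMillis;
    }

    /** Buffers an element assigned to this window; returns false if it was dropped. */
    boolean offer(String element) {
        if (closed) {
            return false; // window already closed: too late, drop the element
        }
        buffered.add(element);
        return true;
    }

    /**
     * Advances the watermark (it never moves backwards). Returns the single pane
     * the first time the watermark reaches window end + allowed lateness,
     * and an empty Optional on every other call.
     */
    Optional<List<String>> advanceWatermark(long newWatermarkMillis) {
        watermarkMillis = Math.max(watermarkMillis, newWatermarkMillis);
        if (!closed && watermarkMillis >= windowEndMillis + allowedLatenessMillis) {
            closed = true;
            return Optional.of(new ArrayList<>(buffered));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed values: window ends at 60 s, allowed lateness is 120 s.
        SinglePaneWindow window = new SinglePaneWindow(60_000L, 120_000L);
        window.offer("on-time");
        // Watermark reaches the end of the window: nothing is emitted yet.
        System.out.println(window.advanceWatermark(60_000L));
        // A late element arrives within the allowed lateness and is still buffered.
        window.offer("late");
        // Watermark reaches end + allowed lateness: the one and only pane is emitted.
        System.out.println(window.advanceWatermark(180_000L));
        // Anything arriving after that is dropped.
        System.out.println(window.offer("too-late"));
    }
}
```

In this model the pane contains both the on-time and the late element, which is the outcome the question asks for. Whether a real pipeline behaves the same way depends on the runner and on how the watermark is estimated, so treat this only as an illustration of the semantics.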