KSQL Hopping Window: accessing only the oldest subwindow
Question
I am tracking the rolling sum of a particular field using a query that looks something like this:
SELECT id, SUM(quantity) AS quantity from stream \
WINDOW HOPPING (SIZE 1 MINUTE, ADVANCE BY 10 SECONDS) \
GROUP BY id;
Now, for every input tick, it seems to return 6 different aggregated values, which I guess are for the following time periods:
[start, start+60] seconds
[start+10, start+60] seconds
[start+20, start+60] seconds
[start+30, start+60] seconds
[start+40, start+60] seconds
[start+50, start+60] seconds
What if I am only interested in getting the [start, start+60] seconds result for every tick that comes in? Is there any way to get ONLY that?
Recommended answer
Because you specified a hopping window, each record falls into multiple windows, and all of those windows need to be updated when a record is processed. Updating only one window would be incorrect, and the result would be wrong.
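To make the "multiple windows" point concrete, here is a minimal Python sketch of the window arithmetic. It is not KSQL or Kafka Streams code: the function name `hopping_windows` and the sample timestamp are made up for illustration, and the sketch assumes windows are aligned to the epoch with an inclusive start and exclusive end, which is my understanding of how Kafka Streams lays out hopping windows.

```python
def hopping_windows(timestamp_ms, size_ms, advance_ms):
    """Return (start, end) pairs for every hopping window containing timestamp_ms.

    Assumption: windows are aligned to the epoch (start times are multiples
    of advance_ms), with an inclusive start and an exclusive end.
    """
    # Earliest aligned window start whose window still covers the timestamp.
    first_start = (max(0, timestamp_ms - size_ms + advance_ms) // advance_ms) * advance_ms
    windows = []
    start = first_start
    while start <= timestamp_ms:
        windows.append((start, start + size_ms))
        start += advance_ms
    return windows


# SIZE 1 MINUTE, ADVANCE BY 10 SECONDS, for a hypothetical record at t = 125 s.
windows = hopping_windows(125_000, 60_000, 10_000)
for start, end in windows:
    print(f"[{start // 1000}s, {end // 1000}s)")
```

Under these assumptions a single record lands in size / advance = 6 overlapping windows, each a full 60 seconds long with start times 10 seconds apart, which would account for the six aggregated values seen per input tick.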
Compare the Kafka Streams docs about hopping windows (Kafka Streams is KSQL's internal runtime engine): https://docs.confluent.io/current/streams/developer-guide/dsl-api.html#hopping-time-windows
Update
Kafka Streams is adding proper sliding window support via KIP-450 (https://cwiki.apache.org/confluence/display/KAFKA/KIP-450%3A+Sliding+Window+Aggregations+in+the+DSL). This should allow sliding windows to be added to ksqlDB later, too.