Grouping based on sequence of rows
Question
I have a table of orders with a column denoting whether it's a buy or a sell, with the rows typically ordered by timestamp. What I'd like to do is operate on groups of consecutive buys, plus their sell. e.g. B B S B S B B S -> (B B S) (B S) (B B S)
Example:
order_action | timestamp
-------------+---------------------
buy | 2013-10-03 13:03:02
buy | 2013-10-08 13:03:02
sell | 2013-10-10 15:58:02
buy | 2013-11-01 09:30:02
buy | 2013-11-01 14:03:02
sell | 2013-11-07 10:34:02
buy | 2013-12-03 15:46:02
sell | 2013-12-09 16:00:03
buy | 2013-12-11 13:02:02
sell | 2013-12-18 15:59:03
I'll be running an aggregate function in the end (the groups are so that I can exclude an entire group based on its sell order), so GROUP BY or partitioned windows seemed like the right way to go, but I can't figure out how to get this specific grouping.
Recommended answer
This can be surprisingly simple with count() as a window aggregate function:
SELECT *
,count(order_action = 'sell' OR NULL) OVER (ORDER BY ts DESC) AS grp
FROM orders;
Using ts instead of timestamp as the column name. Avoid reserved words as identifiers.
count() only counts non-null values. The expression order_action = 'sell' OR NULL results in TRUE for 'sell' and NULL otherwise. With the default frame definition, count() returns a running count from the start of the partition (the whole table in this case) up to the last peer of the current row. The running count of sells groups your rows as requested.
I am ordering descending in the OVER clause to let each group end at a trailing "sell" rather than start at a leading "sell". This results in descending group numbers, but that should not matter: you just need group numbers.
Duplicate timestamps would be a problem (in any case!).
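To make the running count concrete, here is a minimal Python sketch that emulates the descending query on the sample rows from the question. It is only an illustration of the logic, not how the database evaluates the window; the function name running_sell_count_desc is hypothetical, and it assumes unique timestamps so that every row is its own peer group.

```python
# Sample rows from the question: (order_action, ts)
orders = [
    ("buy",  "2013-10-03 13:03:02"),
    ("buy",  "2013-10-08 13:03:02"),
    ("sell", "2013-10-10 15:58:02"),
    ("buy",  "2013-11-01 09:30:02"),
    ("buy",  "2013-11-01 14:03:02"),
    ("sell", "2013-11-07 10:34:02"),
    ("buy",  "2013-12-03 15:46:02"),
    ("sell", "2013-12-09 16:00:03"),
    ("buy",  "2013-12-11 13:02:02"),
    ("sell", "2013-12-18 15:59:03"),
]


def running_sell_count_desc(rows):
    """Emulate count(order_action = 'sell' OR NULL) OVER (ORDER BY ts DESC).

    Assumption: timestamps are unique, so each row has no other peers.
    """
    result = []
    sells_seen = 0
    # Walk from newest to oldest, as ORDER BY ts DESC does.
    for action, ts in sorted(rows, key=lambda r: r[1], reverse=True):
        if action == "sell":
            sells_seen += 1  # the default frame includes the current row
        result.append((action, ts, sells_seen))
    return result


grouped_desc = running_sell_count_desc(orders)
for action, ts, grp in grouped_desc:
    print(f"{action:<4} | {ts} | {grp}")
```

Each sell bumps the counter before its own row is labelled, so a sell shares a group number with the buys that precede it in time.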
One way to get ascending group numbers: use a custom frame definition for the window function:
SELECT *
,count(order_action = 'sell' OR NULL)
OVER (ORDER BY ts ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS grp
FROM orders;
SQL Fiddle demonstrating both.
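The ascending variant, and the group-level exclusion the question asks for, can be sketched in Python as follows. This is an emulation under stated assumptions, not the database's implementation: the names ascending_groups and keep are hypothetical, and the exclusion rule (drop any group whose sell happened in November 2013) is made up purely for illustration.

```python
from collections import defaultdict

# Sample rows from the question: (order_action, ts)
orders = [
    ("buy",  "2013-10-03 13:03:02"),
    ("buy",  "2013-10-08 13:03:02"),
    ("sell", "2013-10-10 15:58:02"),
    ("buy",  "2013-11-01 09:30:02"),
    ("buy",  "2013-11-01 14:03:02"),
    ("sell", "2013-11-07 10:34:02"),
    ("buy",  "2013-12-03 15:46:02"),
    ("sell", "2013-12-09 16:00:03"),
    ("buy",  "2013-12-11 13:02:02"),
    ("sell", "2013-12-18 15:59:03"),
]


def ascending_groups(rows):
    """Emulate count(order_action = 'sell' OR NULL)
    OVER (ORDER BY ts ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)."""
    result = []
    sells_before = 0
    for action, ts in sorted(rows, key=lambda r: r[1]):
        # The frame ends one row before the current row, so label first ...
        result.append((action, ts, sells_before))
        # ... and only then let a sell affect the rows that follow it.
        if action == "sell":
            sells_before += 1
    return result


labelled = ascending_groups(orders)

# Collect the rows of each group, the equivalent of GROUP BY grp.
groups = defaultdict(list)
for action, ts, grp in labelled:
    groups[grp].append((action, ts))


def keep(group_rows):
    """Hypothetical rule: exclude groups whose sell falls in November 2013."""
    return not any(
        action == "sell" and ts.startswith("2013-11")
        for action, ts in group_rows
    )


# Aggregate per group: here, the number of buys in each surviving group.
buys_per_group = {
    grp: sum(1 for action, _ in rows if action == "buy")
    for grp, rows in groups.items()
    if keep(rows)
}
print(buys_per_group)
```

Because the frame stops at the previous row, a sell still carries the number of the buys before it, and the counter only advances for the rows after it, which is what makes the group numbers ascend.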