Group consecutive time intervals in SQL
Problem description
Assume a data structure like this:
stock_name, action, start_date, end_date
google, growing, 1, 2
google, growing, 2, 3
google, falling, 3, 4
google, growing, 4, 5
yahoo, growing, 1, 2
How can I aggregate it to merge consecutive time intervals?
The output would look like:
stock_name, action, start_date, end_date
google, growing, 1, 3
google, falling, 3, 4
google, growing, 4, 5
yahoo, growing, 1, 2
I thought of using a rank window function to give every run of consecutive intervals the same number, and then grouping by that number together with the name and action, but I cannot quite get it to work. I am aiming for something like this:
stock_name, action, start_date, end_date, rank
google, growing, 1, 2, 1
google, growing, 2, 3, 1
google, falling, 3, 4, 1
google, growing, 4, 5, 2
yahoo, growing, 1, 2, 1
If this were MySQL, I could easily solve it with variables, but that is not possible in PostgreSQL.
There could be any number of consecutive intervals, so self-joining a predetermined number of times is not an option.
Elegance of the solution (performance, readability) matters.
You can use variables just fine in PL/pgSQL.
I would solve this with a table function.
Assuming the table is called stock, my code would look like this:
CREATE OR REPLACE FUNCTION combine_periods() RETURNS SETOF stock
   LANGUAGE plpgsql STABLE AS
$$DECLARE
   s stock;
   period stock;
BEGIN
   FOR s IN
      SELECT stock_name, action, start_date, end_date
      FROM stock
      ORDER BY stock_name, action, start_date
   LOOP
      /* is this a new period? */
      IF period IS NOT NULL AND
         (period.stock_name <> s.stock_name
            OR period.action <> s.action
            OR period.end_date <> s.start_date)
      THEN
         /* new period, output last period */
         RETURN NEXT period;
         period := NULL;
      ELSE
         IF period IS NOT NULL
         THEN
            /* period continues, update end_date */
            period.end_date := s.end_date;
         END IF;
      END IF;

      /* remember the beginning of a new period */
      IF period IS NULL
      THEN
         period := s;
      END IF;
   END LOOP;

   /* output the last period */
   IF period IS NOT NULL
   THEN
      RETURN NEXT period;
   END IF;

   RETURN;
END;$$;
And I would call it like this:
test=> SELECT * FROM combine_periods();
┌────────────┬─────────┬────────────┬──────────┐
│ stock_name │ action │ start_date │ end_date │
├────────────┼─────────┼────────────┼──────────┤
│ google │ falling │ 3 │ 4 │
│ google │ growing │ 1 │ 3 │
│ google │ growing │ 4 │ 5 │
│ yahoo │ growing │ 1 │ 2 │
└────────────┴─────────┴────────────┴──────────┘
(4 rows)
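For comparison, the window-function idea from the question can also be made to work without a procedural loop. The following is only a sketch, not part of the answer above: it assumes integer dates as in the sample data, that intervals for the same stock and action never overlap, and that an interval continues the previous one exactly when its start_date equals the previous end_date. The names flagged, is_new_period, numbered and period_no are made up for illustration, and the leading stock CTE merely stands in for the real table so the query can be run on its own.

```sql
WITH stock (stock_name, action, start_date, end_date) AS (
    -- Stand-in for the real table, filled with the sample rows from the question.
    -- Remove this CTE to run the query against an existing "stock" table.
    VALUES ('google', 'growing', 1, 2),
           ('google', 'growing', 2, 3),
           ('google', 'falling', 3, 4),
           ('google', 'growing', 4, 5),
           ('yahoo',  'growing', 1, 2)
),
flagged AS (
    SELECT stock_name, action, start_date, end_date,
           -- 1 if this row does not continue the previous row of the
           -- same stock and action (also 1 for the first row), else 0
           CASE WHEN start_date = lag(end_date) OVER w THEN 0 ELSE 1 END
              AS is_new_period
    FROM stock
    WINDOW w AS (PARTITION BY stock_name, action ORDER BY start_date)
),
numbered AS (
    SELECT stock_name, action, start_date, end_date,
           -- running total of the flags: every run of consecutive
           -- intervals ends up with its own number
           sum(is_new_period) OVER (PARTITION BY stock_name, action
                                    ORDER BY start_date) AS period_no
    FROM flagged
)
SELECT stock_name, action,
       min(start_date) AS start_date,
       max(end_date)   AS end_date
FROM numbered
GROUP BY stock_name, action, period_no
ORDER BY 1, 2, 3;
```

On the sample rows this should produce the same four merged periods as combine_periods(). Whether it is faster or more readable than the function depends on the data and on taste; the function makes a single ordered pass, while this query sorts within each partition for the two window steps.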