GROUP BY and aggregate sequential numeric values
Question
Using PostgreSQL 9.0.
Let's say I have a table containing the fields company, profession and year. I want to return a result which contains unique companies and professions, but aggregates years (into an array is fine) based on numeric sequence:
Example table:
+-----------------------------+
| company | profession | year |
+---------+------------+------+
| Google | Programmer | 2000 |
| Google | Sales | 2000 |
| Google | Sales | 2001 |
| Google | Sales | 2002 |
| Google | Sales | 2004 |
| Mozilla | Sales | 2002 |
+-----------------------------+
I'm interested in a query which would output rows similar to the following:
+-----------------------------------------+
| company | profession | year |
+---------+------------+------------------+
| Google | Programmer | [2000] |
| Google | Sales | [2000,2001,2002] |
| Google | Sales | [2004] |
| Mozilla | Sales | [2002] |
+-----------------------------------------+
The essential feature is that only consecutive years shall be grouped together.
Recommended answer
There's much value in @a_horse_with_no_name's answer, both as a correct solution and, as I already said in a comment, as good material for learning how to use different kinds of window functions in PostgreSQL.
And yet I cannot help feeling that the approach taken in that answer is a bit too much effort for a problem like this one. Basically, what you need is an additional grouping criterion before you go on to aggregate years into arrays. You've already got company and profession; now you only need something to distinguish years that belong to different sequences.
That is just what the above-mentioned answer provides, and that is precisely what I think can be done in a simpler way. Here's how:
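WITH MarkedForGrouping AS (
    SELECT
        company,
        profession,
        year,
        -- Within a run of consecutive years this difference stays constant,
        -- so it can serve as the extra grouping key.
        year - ROW_NUMBER() OVER (
            PARTITION BY company, profession
            ORDER BY year
        ) AS seqID
    FROM atable
)
SELECT
    company,
    profession,
    -- ORDER BY inside the aggregate (available from PostgreSQL 9.0)
    -- makes the element order of each array explicit.
    array_agg(year ORDER BY year) AS years
FROM MarkedForGrouping
GROUP BY
    company,
    profession,
    seqID;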
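Why this works: within one company/profession partition, ROW_NUMBER() grows by exactly 1 per row, so as long as the years are consecutive, year minus the row number stays the same. Every gap in the years shifts that difference, which starts a new group. A minimal sketch for checking this against the sample data from the question follows. The temporary table definition and its column types are assumptions made for illustration, and it presumes each (company, profession, year) combination occurs only once.

```sql
-- Hypothetical setup: a temporary table holding the sample rows from the question.
-- The column types are assumed; adjust them to the real schema.
CREATE TEMPORARY TABLE atable (
    company    text,
    profession text,
    year       integer
);

INSERT INTO atable (company, profession, year) VALUES
    ('Google',  'Programmer', 2000),
    ('Google',  'Sales',      2000),
    ('Google',  'Sales',      2001),
    ('Google',  'Sales',      2002),
    ('Google',  'Sales',      2004),
    ('Mozilla', 'Sales',      2002);

-- Show the row number and the derived grouping key next to each year.
SELECT
    company,
    profession,
    year,
    ROW_NUMBER() OVER w        AS rn,
    year - ROW_NUMBER() OVER w AS seqID
FROM atable
WINDOW w AS (PARTITION BY company, profession ORDER BY year)
ORDER BY company, profession, year;

--  company | profession | year | rn | seqid
-- ---------+------------+------+----+-------
--  Google  | Programmer | 2000 |  1 |  1999
--  Google  | Sales      | 2000 |  1 |  1999
--  Google  | Sales      | 2001 |  2 |  1999
--  Google  | Sales      | 2002 |  3 |  1999
--  Google  | Sales      | 2004 |  4 |  2000
--  Mozilla | Sales      | 2002 |  1 |  2001
```

Rows that share company, profession and seqID form one run of consecutive years, which is what the final GROUP BY relies on. The seqID value itself has no meaning beyond telling runs apart. If duplicate years were possible, using DENSE_RANK() in place of ROW_NUMBER() should keep repeated years in the same run, though the resulting arrays would then contain the repeats.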