PostgreSQL:使用窗口函数对组大小进行分组 [英] Postgresql: Grouping with limit on group size using window functions

查看:129
本文介绍了PostgreSQL:使用窗口函数对组大小进行分组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Postgresql中是否有一种方法可以写一个查询,该查询根据具有限制的列对行进行分组而不会丢弃其他行。



说我有一个表具有三列 id,颜色,分数和以下行

  1红色10.0 
2红色7.0
3红色3.0
4蓝色5.0
5绿色4.0
6蓝色2.0
7蓝色1.0

我可以使用以下查询通过带有窗口函数的颜色进行分组

  SELECT * FROM(
SELECT id,color,score,rank()
OVER(PARTITION BY color ORDER BY score DESC)
FROM grouping_test
)AS foo WHERE rank< = 2;

结果

  id |颜色得分|等级
---- + ------- + ------- + ------
4 |蓝色| 5.0 | 1
6 |蓝色| 2.0 | 2
5 |绿色| 4.0 | 1
1 |红色| 10.0 | 1
2 |红色| 7.0 | 2

它会丢弃等级大于2的项目。但是我需要的是类似的结果

  1红色10.0 
2红色7.0
4蓝色5.0
6蓝色2.0
5绿色4.0
3红色3.0
7蓝色1.0

没有被丢弃的行。 / p>

编辑:
要更准确地了解我需要的逻辑,


  1. 让我获得得分最高的行

  2. 具有相同颜色和最高得分的下一行

  3. 在其余项目中得分最高的项目

  4. 与2相同,但对于3中的行。

    ...

只要找到具有相同颜色的对,就继续,然后按降序排序剩下的。



可以找到测试表的导入语句这里
感谢您的帮助。

解决方案

可以使用两个嵌套的窗口函数来完成

 选择
id
FROM(
选择
id,
颜色,
分数,
((rank()OVER color_window)-1)/ 2 AS rank_window_id
FROM grouping_test
WINDOW color_window AS(PARTITION BY color ORDER BY score DESC)
)as foo
WINDOW rank_window AS(PARTITION BY(color,rank_window_id))
ORDER BY
(max(score)OVER rank_window)DESC,
color;

其中 2 是组的参数大小。


Is there a way in Postgresql to write a query which groups rows based on a column with a limit without discarding additional rows.

Say I've got a table with three columns id, color, score with the following rows

1 red 10.0
2 red 7.0
3 red 3.0
4 blue 5.0
5 green 4.0
6 blue 2.0
7 blue 1.0

I can get a grouping based on color with window functions with the following query

SELECT * FROM (
    SELECT id, color, score, rank()
    OVER (PARTITION BY color ORDER BY score DESC)
    FROM grouping_test
) AS foo WHERE rank <= 2;

with the result

  id | color | score | rank 
 ----+-------+-------+------
   4 | blue  |   5.0 |    1
   6 | blue  |   2.0 |    2
   5 | green |   4.0 |    1
   1 | red   |  10.0 |    1
   2 | red   |   7.0 |    2

which discards item with ranks > 2. However what I need is a result like

1 red 10.0
2 red 7.0
4 blue 5.0
6 blue 2.0
5 green 4.0
3 red 3.0
7 blue 1.0

With no discarded rows.

Edit: To be more precise about the logic I need:

  1. Get me the row with the highest score
  2. The next row with the same color and the highest possible score
  3. The item with the highest score of the remaining items
  4. Same as 2., but for the row from 3.
    ...

Continue as long as pairs with the same color can be found, then order whats left by descending score.

The import statements for a test table can be found here. Thanks for your help.

解决方案

It can be done using two nested window functions

SELECT
  id
FROM (
  SELECT
    id,
    color,
    score,
    ((rank() OVER color_window) - 1) / 2 AS rank_window_id
  FROM grouping_test
  WINDOW color_window AS (PARTITION BY color ORDER BY score DESC)
) as foo
WINDOW rank_window AS (PARTITION BY (color, rank_window_id))
ORDER BY
  (max(score) OVER rank_window) DESC,
  color;

With 2 being the parameter of the group size.

这篇关于PostgreSQL:使用窗口函数对组大小进行分组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆