Count rows in partition with Order By

Problem description

I was trying to understand PARTITION BY in Postgres by writing a few sample queries. I have a test table that I run my queries against.

id integer | num integer
___________|_____________
1          | 4 
2          | 4
3          | 5
4          | 6

When I run the following query, I get the output I expected.

SELECT id, COUNT(id) OVER(PARTITION BY num) from test;

id         | count
___________|_____________
1          | 2 
2          | 2
3          | 1
4          | 1

But when I add ORDER BY to the partition, the counts change:

SELECT id, COUNT(id) OVER(PARTITION BY num ORDER BY id) from test;

id         | count
___________|_____________
1          | 1 
2          | 2
3          | 1
4          | 1

My understanding is that COUNT is computed across all rows that fall into a partition. Here, I have partitioned the rows by num. The number of rows in the partition is the same, with or without an ORDER BY clause. Why is there a difference in the outputs?

Solution

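When you add an ORDER BY to an aggregate used as a window function, that aggregate turns into a "running" aggregate (a running count here, or a running version of whatever aggregate you use). The partition itself does not change. What changes is the window frame, the subset of the partition the aggregate actually sees: without ORDER BY the frame is the whole partition, while with ORDER BY and no explicit frame clause the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

So count(*) returns the number of rows up to and including the "current one" in the specified order, plus any rows that tie with the current row on the ORDER BY expression.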
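
To make the frame visible, here is a minimal sketch that runs the question's query three ways. It inlines the question's test data as a CTE so it is self-contained; the column aliases (implicit_frame, explicit_default, whole_partition) are made up for illustration and are not part of the original question.

```sql
-- Same sample data as the question's test table, inlined as a CTE.
WITH test (id, num) AS (
  VALUES (1, 4), (2, 4), (3, 5), (4, 6)
)
SELECT id,
       num,
       -- ORDER BY with no frame clause: the default frame applies.
       COUNT(id) OVER (PARTITION BY num ORDER BY id) AS implicit_frame,
       -- The same default frame, written out explicitly.
       COUNT(id) OVER (PARTITION BY num ORDER BY id
                       RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS explicit_default,
       -- Frame widened to the whole partition: ORDER BY no longer affects the count.
       COUNT(id) OVER (PARTITION BY num ORDER BY id
                       ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS whole_partition
FROM test
ORDER BY id;
```

The first two columns should both come out as 1, 2, 1, 1 (the output from the question), while whole_partition should come out as 2, 2, 1, 1, matching the query without ORDER BY. If all you need is the partition size, simply leaving ORDER BY out of the window is the shorter option.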

The following query shows the different results for aggregates used with an order by. With sum() instead of count() it's a bit easier to see (in my opinion).

with test (id, num, x) as (
  values 
    (1, 4, 1),
    (2, 4, 1),
    (3, 5, 2),
    (4, 6, 2)
)
select id, 
       num,
       x,
       count(*) over () as total_rows, 
       count(*) over (order by id) as rows_upto,
       count(*) over (partition by x order by id) as rows_per_x,
       sum(num) over (partition by x) as total_for_x,
       sum(num) over (order by id) as sum_upto,
       sum(num) over (partition by x order by id) as sum_for_x_upto
from test;

will result in:

id | num | x | total_rows | rows_upto | rows_per_x | total_for_x | sum_upto | sum_for_x_upto
---+-----+---+------------+-----------+------------+-------------+----------+---------------
 1 |   4 | 1 |          4 |         1 |          1 |           8 |        4 |              4
 2 |   4 | 1 |          4 |         2 |          2 |           8 |        8 |              8
 3 |   5 | 2 |          4 |         3 |          1 |          11 |       13 |              5
 4 |   6 | 2 |          4 |         4 |          2 |          11 |       19 |             11
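
One detail the table above cannot show, because id is unique, is what happens when the ORDER BY column has ties. The sketch below orders by num instead, reusing the question's data; the aliases count_range and count_rows are hypothetical names chosen for this example.

```sql
WITH test (id, num) AS (
  VALUES (1, 4), (2, 4), (3, 5), (4, 6)
)
SELECT id,
       num,
       -- Default RANGE frame: rows with an equal num are peers of the
       -- current row and are all included in its frame.
       COUNT(*) OVER (ORDER BY num) AS count_range,
       -- ROWS frame: counts physical rows up to the current one,
       -- so tied rows receive different values.
       COUNT(*) OVER (ORDER BY num
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS count_rows
FROM test
ORDER BY id;
```

With the default frame, both rows with num = 4 should report 2, followed by 3 and 4. With the ROWS frame the two tied rows get 1 and 2, but which of them gets which is not determined by ORDER BY num alone, so add a tie-breaker such as id to the window's ORDER BY if that matters.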

There are more examples in the Postgres manual.
