PostgreSQL - "DISTINCT ON" and "GROUP BY" syntax

Problem description

I realized that a database query was returning unexpected results due to my improper use of "DISTINCT ON" and "GROUP BY".

I'm hoping someone can set me straight on this. The actual query is quite complex, so I'll simplify it:

I have a table/inner query that consists of an object_id and a timestamp:

CREATE TABLE test_select ( object_id INT , event_timestamp timestamp );
COPY test_select (object_id , event_timestamp) FROM stdin (DELIMITER '|');
1           | 2013-01-27 21:01:20
1           | 2012-06-28 14:36:26
1           | 2013-02-21 04:16:48
2           | 2012-06-27 19:53:05
2           | 2013-02-03 17:35:58
3           | 2012-06-14 20:17:00
3           | 2013-02-15 19:03:34
4           | 2012-06-13 13:59:47
4           | 2013-02-23 06:31:16
5           | 2012-07-03 01:45:56
5           | 2012-06-11 21:33:26
\.

I'm trying to select each distinct object_id once, ordered by its most recent timestamp in reverse chronological order,

so the results should be [4, 1, 3, 2, 5].

I think this does what I need (it seems to):

SELECT object_id  
FROM test_select 
GROUP BY object_id 
ORDER BY max(event_timestamp) DESC
;

For testing/auditing purposes, I sometimes want to include the timestamp field. I can't seem to figure out how to include another field with that query.

Can anyone point out glaring problems in my SQL above, or suggest how to include the auditing info?

Recommended answer

To be able to select all columns, and not only object_id and MAX(event_timestamp), you can use DISTINCT ON:

SELECT DISTINCT ON (object_id) 
    object_id, event_timestamp    ---, more columns
FROM test_select 
ORDER BY object_id, event_timestamp DESC ;

If you want the results ordered by event_timestamp DESC and not by object_id, you need to wrap the query in a derived table or a CTE:

SELECT *
FROM 
  ( SELECT DISTINCT ON (object_id) 
        object_id, event_timestamp    ---, more columns
    FROM test_select 
    ORDER BY object_id, event_timestamp DESC 
  ) AS t
ORDER BY event_timestamp DESC ;
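
The answer mentions a CTE as an alternative to the derived table but only shows the derived-table form. A minimal sketch of the CTE form follows; the name `latest` is a hypothetical label chosen for illustration, and the query should be equivalent to the derived-table version above.

```sql
-- Sketch only: "latest" is an illustrative CTE name, not from the original answer.
WITH latest AS (
    -- Keep one row per object_id: the one with the newest event_timestamp.
    SELECT DISTINCT ON (object_id)
           object_id, event_timestamp
    FROM test_select
    ORDER BY object_id, event_timestamp DESC
)
-- Re-sort the deduplicated rows, newest first.
SELECT object_id, event_timestamp
FROM latest
ORDER BY event_timestamp DESC;
```

The inner ORDER BY has to start with object_id because PostgreSQL requires the DISTINCT ON expressions to match the leftmost ORDER BY expressions, which is why the final ordering is applied in the outer query.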

Alternatively, you can use window functions, like ROW_NUMBER():

WITH cte AS
  ( SELECT ROW_NUMBER() OVER (PARTITION BY object_id 
                              ORDER BY event_timestamp DESC) 
             AS rn, 
           object_id, event_timestamp    ---, more columns
    FROM test_select 
  )
SELECT object_id, event_timestamp    ---, more columns
FROM cte
WHERE rn = 1
ORDER BY event_timestamp DESC ;

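Or use the aggregate MAX() as a window function with OVER. Unlike the variants above, this returns every row that ties for the latest timestamp within an object_id: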

WITH cte AS
  ( SELECT MAX(event_timestamp) OVER (PARTITION BY object_id) 
             AS max_event_timestamp, 
           object_id, event_timestamp    ---, more columns
    FROM test_select 
  )
SELECT object_id, event_timestamp    ---, more columns
FROM cte
WHERE event_timestamp = max_event_timestamp
ORDER BY event_timestamp DESC ;
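
For the auditing case in the question, where the only extra field wanted is the latest timestamp itself, the original GROUP BY query may already be enough once the aggregate is added to the select list. A minimal sketch, assuming the sample data loaded above; the alias `latest_event` is a hypothetical name used for illustration.

```sql
-- Sketch only: "latest_event" is an illustrative alias.
-- GROUP BY allows aggregates of non-grouped columns in the select list,
-- so the newest timestamp can be shown next to each object_id.
SELECT object_id,
       max(event_timestamp) AS latest_event
FROM test_select
GROUP BY object_id
ORDER BY latest_event DESC;

-- With the sample rows from the question this yields:
--  object_id |    latest_event
-- -----------+---------------------
--          4 | 2013-02-23 06:31:16
--          1 | 2013-02-21 04:16:48
--          3 | 2013-02-15 19:03:34
--          2 | 2013-02-03 17:35:58
--          5 | 2012-07-03 01:45:56
```

This only works for values that can be expressed as aggregates. To return other columns from the same row as the latest timestamp, one of the DISTINCT ON or window-function queries above is needed.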
