Vertica 中每组查询的前 N ​​个 [英] Top N per group query in Vertica

查看:31
本文介绍了Vertica 中每组查询的前 N ​​个的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这是一个老问题 - 在 Vertica 中寻找最佳解决方案.想象一个有列的表:-

This is an old problem - looking for the best solution in Vertica. Imagine a table with columns:-

A, B, C, D, E

A-D 列是整数或 varchars,E 列是时间戳列,其默认值为 GETUTCDATE().

Columns A-D are ints or varchars and column E is a timestamptz column that has a default value of GETUTCDATE().

表格内容示例:-

1, 2, "AAA", 4, 1404305559
1, 2, "BBB", 23, 1404305633
1, 2, "CCC", 62, 1404305705  <-- the max entry for (1,2,"CCC")
1, 2, "AAA", 123, 1404305740 <-- the max entry for (1,2,"AAA")
1, 2, "BBB", 91, 1404305778  <-- the max entry for (1,2,"BBB")

因此复合 (A,B,C) 值可能存在重复行(D 列是一个值,E 列是时间戳).

So potentially there are repeating rows for the composite (A,B,C) value (with column D being a value and column E the timestamp).

我想要一个结果集,显示每个唯一 (A,B,C) 组合的最新行及其值.因此,上述结果集将如下所示:-

I'd like a resultset that showed, for each unique (A,B,C) combination, the latest row and its value. Hence the resultset for the above would look like:-

1, 2, "CCC", 62, 1404305705  
1, 2, "AAA", 123, 1404305740 
1, 2, "BBB", 91, 1404305778  

推荐答案

让我们设置示例数据:

CREATE TABLE public.test (
  A int,
  B int,
  C varchar,
  D int, 
  E int
);

INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'AAA', 4, 1404305559);
INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'BBB', 23, 1404305633);
INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'CCC', 62, 1404305705);
INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'AAA', 123, 1404305740);
INSERT INTO public.test (A, B, C, D, E) VALUES (1, 2, 'BBB', 91, 1404305778);

COMMIT;

我们将使用 RANK 函数根据 A、B、C 对每一行进行排序,并在 E 上排序并仅返回行位于顶部(排名为 1).

We'll use the RANK function to rank each row based on A, B, C and sort on E and return only the rows that are at the top (have a rank of 1).

SELECT a.a, 
       a.b, 
       a.c, 
       a.d, 
       a.e 
FROM   (SELECT a, 
               b, 
               c, 
               d, 
               e, 
               RANK() 
                 OVER ( 
                   PARTITION BY a, b, c 
                   ORDER BY e DESC) AS rank 
        FROM   public.test) a 
WHERE  a.rank = 1; 

返回:

 A | B |  C  |  D  |     E
---+---+-----+-----+------------
 1 | 2 | CCC |  62 | 1404305705
 1 | 2 | AAA | 123 | 1404305740
 1 | 2 | BBB |  91 | 1404305778

这篇关于Vertica 中每组查询的前 N ​​个的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆