Return rows that are max of one column in PostgreSQL
Problem description
Example data in my table test_table:
date symbol value created_time
2010-01-09 symbol1 101 3847474847
2010-01-10 symbol1 102 3847474847
2010-01-10 symbol1 102.5 3847475500
2010-01-10 symbol2 204 3847474847
2010-01-11 symbol1 109 3847474847
2010-01-12 symbol1 105 3847474847
2010-01-12 symbol2 206 3847474847
Given the table above, I am trying to find the optimal index to put on the table (the combination of date, symbol, value, and created_time should be unique) and the query to go along with it to return the following:
date symbol value created_time
2010-01-09 symbol1 101 3847474847
2010-01-10 symbol1 102.5 3847475500
2010-01-10 symbol2 204 3847474847
2010-01-11 symbol1 109 3847474847
2010-01-12 symbol1 105 3847474847
2010-01-12 symbol2 206 3847474847
I am looking for the date, symbol, and value columns of the row with the maximum created_time in each (date, symbol) group (essentially rows 1, 3, 4, 5, 6, and 7 of the example above).
Currently I have tried this index:
CREATE UNIQUE INDEX "test_table_date_symbol_value_created_time"
ON "test_table" USING btree (date, symbol, value, created_time)
And I am using this query. I am not sure it is the most efficient approach; it still seems pretty slow.
select *
from(
select date,
symbol,
value,
created_time,
max(created_time) over (partition by date, symbol) as max_created_time
from "test_table"
) t
where symbol in ('symbol1', 'symbol2') and created_time = max_created_time
Recommended answer
Postgres supports window functions that suit this situation:
select date, symbol, value, created_time
from (select *,
rank() over (partition by date, symbol order by created_time desc) as rownum
from test_table) x
where rownum = 1
For every combination of date and symbol, this query returns the value and created_time from the row with the highest (i.e. last) created_time for that date and symbol.
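To try the query against the sample data from the question, a minimal reproduction might look like the following. The column types are assumptions, since the question does not state them; bigint is used for created_time because the sample values exceed the 32-bit integer range.

```sql
-- Column types are assumptions; the question does not give the table definition.
CREATE TABLE test_table (
    date         date    NOT NULL,
    symbol       text    NOT NULL,
    value        numeric NOT NULL,
    created_time bigint  NOT NULL
);

-- Sample rows copied from the question.
INSERT INTO test_table (date, symbol, value, created_time) VALUES
    ('2010-01-09', 'symbol1', 101,   3847474847),
    ('2010-01-10', 'symbol1', 102,   3847474847),
    ('2010-01-10', 'symbol1', 102.5, 3847475500),
    ('2010-01-10', 'symbol2', 204,   3847474847),
    ('2010-01-11', 'symbol1', 109,   3847474847),
    ('2010-01-12', 'symbol1', 105,   3847474847),
    ('2010-01-12', 'symbol2', 206,   3847474847);

-- Rank rows within each (date, symbol) group, newest created_time first,
-- and keep only the top-ranked row of each group.
SELECT date, symbol, value, created_time
FROM (
    SELECT *,
           rank() OVER (PARTITION BY date, symbol
                        ORDER BY created_time DESC) AS rownum
    FROM test_table
) x
WHERE rownum = 1
ORDER BY date, symbol;  -- added only to make the output order deterministic
```

With this data the query returns the six rows listed in the question. Because rank() assigns the same rank to ties, two rows that share a date, symbol, and created_time but differ in value would both be returned; row_number() would keep only one of them.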
I suggest using this index:
CREATE UNIQUE INDEX test_table_idx
ON test_table (date, symbol, created_time, value)
It's a covering index (it has all the values the query needs, obviating the need to access the actual table, which was also true of the index you already had), but note that created_time comes before value, so the data is already in partition order, and value is the least significant attribute because it plays no part in determining which row to return.
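Whether the planner actually takes advantage of the index is best checked rather than assumed. The sketch below uses EXPLAIN to inspect the plan, and does so with DISTINCT ON, a PostgreSQL-specific alternative to the window-function query. DISTINCT ON is not part of the answer above and is shown only as something to compare; the symbol filter is taken from the question.

```sql
-- Alternative formulation (not from the answer): DISTINCT ON keeps the first
-- row of each (date, symbol) group according to the ORDER BY clause, i.e. the
-- row with the highest created_time.
-- Note: EXPLAIN with ANALYZE really executes the query to collect timings.
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT ON (date, symbol)
       date, symbol, value, created_time
FROM test_table
WHERE symbol IN ('symbol1', 'symbol2')
ORDER BY date, symbol, created_time DESC;
```

An "Index Only Scan" node in the output would indicate that the rows are being read from the index alone. The planner may still choose a different plan depending on table size, statistics, and how recently the table was vacuumed, so it is worth comparing the timings of both formulations on real data. Unlike rank(), DISTINCT ON returns exactly one row per group even when created_time values tie.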