如何在postgres查询中排名 [英] How to rank in postgres query

查看:74
本文介绍了如何在postgres查询中排名的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试对表中的数据子集进行排名,但是我认为我做错了什么。我找不到太多有关postgres的rank()功能的信息,也许我找错了地方。不管哪种方式:

I'm trying to rank a subset of data within a table but I think I am doing something wrong. I cannot find much information about the rank() feature for postgres, maybe I'm looking in the wrong place. Either way:

我想知道基于日期的,属于表簇的id的排名。我的查询如下:

I'd like to know the rank of an id that falls within a cluster of a table based on a date. My query is as follows:

select cluster_id,feed_id,pub_date,rank 
from (select feed_id,pub_date,cluster_id,rank() 
    over (order by pub_date asc) from url_info) 
as bar where cluster_id = 9876 and feed_id = 1234;

我在以下stackoverflow帖子之后对此进行建模: postgres排名

I'm modeling this after the following stackoverflow post: postgres rank

我认为我做错事的原因是url_info中只有39行位于cluster_id 9876中,并且该查询运行了10分钟,再也没有回来。 (实际上重新运行了一段时间,它没有返回任何结果,但是在群集9876中有一行ID为1234的行)。我希望这会告诉我 id 1234是给定条件的第5位。会根据我的查询约束返回相对排名,对吗?

The reason I think I am doing something wrong is that there are only 39 rows in url_info that are in cluster_id 9876 and this query ran for 10 minutes and never came back. (actually re-ran it for quite a while and it returned no results, yet there is a row in cluster 9876 for id 1234) I'm expecting this will tell me something like "id 1234 was 5th for the criteria given). It will return a relative rank according to my query constraints, correct?

这是Postgres 8.4 btw。

This is postgres 8.4 btw.

推荐答案

通过将rank()函数放置在子选择中,而不在over子句或该子选择中的任何谓词中指定PARTITION BY,您的查询将要求在整个url_info表上生成一个排名$ p
这可能是为什么它要运行到只要对所有url_info进行排名的原因,Pg必须按pub_date对整个表进行排序,这会花费一些时间。

By placing the rank() function in the subselect and not specifying a PARTITION BY in the over clause or any predicate in that subselect, your query is asking to produce a rank over the entire url_info table ordered by pub_date. This is likely why it ran so long as to rank over all of url_info, Pg must sort the entire table by pub_date, which will take a while if the table is very large.

似乎您只想为where子句选择的记录集生成一个等级,在这种情况下,您需要做的就是消除subselect,并且rank函数隐含在匹配的记录集上

It appears you want to generate a rank for just the set of records selected by the where clause, in which case, all you need do is eliminate the subselect and the rank function is implicitly over the set of records matching that predicate.

select 
  cluster_id
 ,feed_id
 ,pub_date
 ,rank() over (order by pub_date asc) as rank
from url_info
where cluster_id = 9876 and feed_id = 1234;

如果您真正想要的是集群中的排名,则无论feed_id如何,您都可以在筛选该集群的子选择:

If what you really wanted was the rank within the cluster, regardless of the feed_id, you can rank in a subselect which filters to that cluster:

select ranked.*
from (
  select 
    cluster_id
   ,feed_id
   ,pub_date
   ,rank() over (order by pub_date asc) as rank
  from url_info
  where cluster_id = 9876
) as ranked
where feed_id = 1234;

这篇关于如何在postgres查询中排名的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆