Find top 1000 entries along with count and rank from a table


Question

I have a table in Redshift with around 30 billion rows and the following structure:

userid    itemid   country   start_date
uid1     itemid1  country1  2018-07-25 00:00:00
uid2     itemid2  country1  2018-07-25 00:00:00
uid3     itemid1  country2  2018-07-25 00:00:00
uid4     itemid3  country1  2018-07-25 00:00:00
uid5     itemid1  country1  2018-07-25 00:00:00
uid1     itemid2  country2  2018-07-25 00:00:00
uid2     itemid2  country2  2018-07-25 00:00:00

I want to find how many unique users bought each item, and then pick the 1000 best-selling items for each country and start_date. Both the rank and the number of times each item was sold are required.

The expected output is:

itemid     country   sold_count   start_date
itemid1    country1   2           2018-07-25 00:00:00
itemid2    country2   2           2018-07-25 00:00:00
itemid1    country2   1           2018-07-25 00:00:00
itemid2    country1   1           2018-07-25 00:00:00
itemid3    country1   1           2018-07-25 00:00:00

I am trying to use the rank function, but I am not getting the expected result.

I am trying the following query:

  select itemid, start_date, Rank() over (partition by itemid order by 
  count(distinct(userid)) desc) as rank1
  from table_name 
  group by item_id, start_date
  order by rank1 desc;

I also want a column with the count of unique userids that bought each item, grouped by country and start_date. In the query above, I left out the country column to keep the query simple.

Please help me.

Recommended answer

If I assume that "version" means "country", then I think you want:

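select *
from (select itemid, country, start_date,
             count(distinct userid) as num_users,  -- unique buyers per item
             -- number the items within each (country, start_date), best seller first
             row_number() over (partition by country, start_date 
                                order by count(distinct userid) desc
                               ) as seqnum
      from table_name 
      group by itemid, country, start_date  -- the column is itemid, as in the table
     ) x
where seqnum <= 1000;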
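
To check the idea on the sample rows from the question, here is a minimal sketch that runs an equivalent query through Python's built-in sqlite3 module. SQLite (3.25 or later, for window functions) is only a stand-in for Redshift here, so treat it as an illustration of the logic and not as a statement about Redshift behaviour or performance. The query is split into an aggregation step and a ranking step, which is a rewrite of the answer and not its exact text. The names `counts`, `sold_count`, `rnk` and `top_items` are made up for this example, and the `itemid` tie-breaker in `row_number()` is an added assumption so that the output order is deterministic. `rank()` is included next to `row_number()` because the question asks for a rank: tied items share a rank, while the sequence number stays unique.

```python
import sqlite3

# Sample rows copied from the question: (userid, itemid, country, start_date).
ROWS = [
    ("uid1", "itemid1", "country1", "2018-07-25 00:00:00"),
    ("uid2", "itemid2", "country1", "2018-07-25 00:00:00"),
    ("uid3", "itemid1", "country2", "2018-07-25 00:00:00"),
    ("uid4", "itemid3", "country1", "2018-07-25 00:00:00"),
    ("uid5", "itemid1", "country1", "2018-07-25 00:00:00"),
    ("uid1", "itemid2", "country2", "2018-07-25 00:00:00"),
    ("uid2", "itemid2", "country2", "2018-07-25 00:00:00"),
]

# Step 1 (CTE): count distinct buyers per item, country and start_date.
# Step 2: number and rank the items inside each (country, start_date) group.
QUERY = """
with counts as (
    select itemid, country, start_date,
           count(distinct userid) as sold_count
    from table_name
    group by itemid, country, start_date
)
select itemid, country, start_date, sold_count, seqnum, rnk
from (select counts.*,
             row_number() over (partition by country, start_date
                                order by sold_count desc, itemid) as seqnum,
             rank() over (partition by country, start_date
                          order by sold_count desc) as rnk
      from counts) x
where seqnum <= ?
order by country, start_date, seqnum
"""


def top_items(rows, limit=1000):
    """Return the top `limit` items per (country, start_date) with count and rank."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table table_name "
        "(userid text, itemid text, country text, start_date text)"
    )
    conn.executemany("insert into table_name values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY, (limit,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_items(ROWS):
        print(row)
```

On the sample data, itemid2 and itemid3 in country1 both have one buyer, so they get the same `rnk` but different `seqnum` values. Filtering on `seqnum` returns at most 1000 rows per group; filtering on `rnk` instead could return more than 1000 rows when items tie at the cut-off.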
