Top N results in BigQuery across multiple columns
Question
Suppose I have the following records:
id  studio     movie
1   fox        avatar
2   paramount  transformers
etc.
I want a query that returns the top 2 studios by number of titles and, within each of those studios, the first 3 movies sorted alphabetically. The results would look something like this:
studio (top 2 by title count)  title (top 3 alphabetically)
fox                            avatar
fox                            avatar2
fox                            avatar3
sony                           ace in the hole
sony                           antonio
sony                           spider-man
How would I write a query to get this? So far I have something like the following, but I'm not sure how to do the sort at the end:
select * from `table` where studio in (
SELECT studio FROM `table` group by studio order by count(*) desc limit 3
)
Recommended answer
You're going to need to use some combination of window functions (like ROW_NUMBER) and aggregation.
Here is one possible approach (I made up the table identifiers, so you'll have to insert your own):
-- Rank studios by how many titles they have (1 = most titles).
WITH studio_counts AS
(
    SELECT
        studio
        ,ROW_NUMBER() OVER(ORDER BY COUNT(studio) DESC) AS rownum
    FROM
        project.dataset.movies
    GROUP BY
        studio
)
SELECT
    mc.studio
    ,mc.movie_title
FROM
(
    -- Number each studio's movies alphabetically, keeping only the top 2 studios.
    SELECT
        m.studio
        ,m.movie_title
        ,ROW_NUMBER() OVER(PARTITION BY m.studio ORDER BY m.movie_title) AS rownum2
    FROM
        studio_counts AS sc
        INNER JOIN project.dataset.movies AS m
            ON sc.studio = m.studio
    WHERE
        sc.rownum < 3      -- top 2 studios
) AS mc
WHERE
    mc.rownum2 < 4         -- first 3 movies per studio
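To sanity-check the idea without a BigQuery project, here is a minimal sketch that runs the same two-step ranking against an in-memory SQLite database from Python. This is an assumption-heavy stand-in, not BigQuery itself: the `movies` table, the sample rows, and the `top_movies` helper are all made up for illustration, and it assumes the SQLite bundled with your Python supports window functions (3.25 or later). It also adds a tie-breaker on studio name and a final ORDER BY, which the query above does not have, so that the output order is deterministic.

```python
import sqlite3

# Hypothetical sample data; table and column names are invented for this sketch.
SAMPLE_ROWS = [
    ("fox", "avatar3"), ("fox", "avatar"), ("fox", "avatar2"), ("fox", "titanic"),
    ("sony", "spider-man"), ("sony", "antonio"), ("sony", "ace in the hole"),
    ("paramount", "transformers"),
]

QUERY = """
WITH counts AS (
    -- Number of titles per studio.
    SELECT studio, COUNT(*) AS n
    FROM movies
    GROUP BY studio
),
studio_counts AS (
    -- Rank studios by title count; studio name breaks ties deterministically.
    SELECT studio,
           ROW_NUMBER() OVER (ORDER BY n DESC, studio) AS rownum
    FROM counts
),
ranked AS (
    -- Number each top studio's movies alphabetically.
    SELECT m.studio,
           m.movie_title,
           sc.rownum AS studio_rank,
           ROW_NUMBER() OVER (PARTITION BY m.studio ORDER BY m.movie_title) AS rownum2
    FROM studio_counts AS sc
    INNER JOIN movies AS m ON sc.studio = m.studio
    WHERE sc.rownum <= 2
)
SELECT studio, movie_title
FROM ranked
WHERE rownum2 <= 3
ORDER BY studio_rank, movie_title
"""


def top_movies(rows):
    """Load rows into an in-memory table and return the top 3 titles of the top 2 studios."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE movies (studio TEXT, movie_title TEXT)")
        conn.executemany("INSERT INTO movies VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for studio, title in top_movies(SAMPLE_ROWS):
        print(studio, title)
```

With this sample data, fox (4 titles) and sony (3 titles) are the top two studios, and paramount is filtered out. In BigQuery you would swap `movies` for your own fully qualified table name.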