查找某年获奖次数最多的电影-代码重复 [英] Find movies with highest number of awards in certain year - code duplication
问题描述
我正尝试编写一个查询(PostgreSQL)以获取 2012年获奖次数最多的电影。
I am trying to write a query (PostgreSQL) to get "Movies with highest number of awards in year 2012."
我有以下表格:
CREATE TABLE Award(
ID_AWARD bigserial CONSTRAINT Award_pk PRIMARY KEY,
award_name VARCHAR(90),
category VARCHAR(90),
award_year integer,
CONSTRAINT award_unique UNIQUE (award_name, category, award_year));
CREATE TABLE AwardWinner(
ID_AWARD integer,
ID_ACTOR integer,
ID_MOVIE integer,
CONSTRAINT AwardWinner_pk PRIMARY KEY (ID_AWARD));
然后我写了以下查询,给出了正确的结果,但是我有很多代码重复
And I written following query, which gives correct results, but there's is quite a lot of code duplication I think.
select * from
(select id_movie, count(id_movie) as awards
from Award natural join awardwinner
where award_year = 2012 group by id_movie) as SUB
where awards = (select max(count) from
(select id_movie, count(id_movie)
from Award natural join awardwinner
where award_year = 2012 group by id_movie) as SUB2);
所以 SUB
和 SUB2
是完全相同的子查询。有更好的方法吗?
So SUB
and SUB2
are exactly the same subquery. Is there a better way to do this?
推荐答案
获取所有获奖影片
Get all winning movies
SELECT id_movie, awards
FROM (
SELECT aw.id_movie, count(*) AS awards
,rank() OVER (ORDER BY count(aw.id_movie) DESC) AS rnk
FROM award a
JOIN awardwinner aw USING (id_award)
WHERE a.award_year = 2012
GROUP BY aw.id_movie
) sub
WHERE rnk = 1;
要点
-
这应该比到目前为止的建议更简单,更快捷。使用
EXPLAIN ANALYZE
进行测试。在某些情况下,CTE有助于避免代码重复。但是现在还不行:子查询可以很好地完成工作,而且通常更快。
There are cases where CTEs are instrumental to avoid code duplication. But not in this time: a subquery does the job just fine and is usually faster.
您可以在同一查询上的聚合函数上运行窗口函数水平。这就是它起作用的原因:
You can run a window function OVER an aggregate function on the same query level. That's why this works:
rank() OVER (ORDER BY count(aw.id_movie) DESC) AS rnk
-
我建议在JOIN条件下使用显式列名,而不要使用
NATURAL JOIN
,如果以后更改/向基础表中添加列,则很容易损坏。
使用使用
I'd suggest to use explicit column names in the JOIN condition instead of
NATURAL JOIN
, which is prone to breakage if you later change / add columns to the underlying tables.
The JOIN condition withUSING
is almost as short, but doesn't break as easily.因为几乎是短的,但是却不那么容易破解。因为
id_movie
不能为NULL (由JOIN条件以及pk的一部分决定),使用count(*)
较短,但速度稍快。结果相同。Since
id_movie
cannot be NULL (ruled out by the JOIN condition and also part of the pk) it is shorter ans slightly faster to usecount(*)
instead. Same result.更快,但是,如果您只需要一个赢家:
Shorter and faster, yet, if you only need one winner:
SELECT aw.id_movie, count(*) AS awards FROM award a JOIN awardwinner aw USING (id_award) WHERE a.award_year = 2012 GROUP BY 1 ORDER BY 2 DESC, 1 -- as tie breaker LIMIT 1
使用位置引用(
1
,2
)作为速记。
我向id_movie
添加了ORDER BY
作为决胜局,以防多部电影获胜。Using positional references (
1
,2
) here as shorthand.
I addedid_movie
toORDER BY
as tie breaker in case multiple movies should qualify for the win.这篇关于查找某年获奖次数最多的电影-代码重复的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!