查找某年获奖次数最多的电影-代码重复 [英] Find movies with highest number of awards in certain year - code duplication

查看:135
本文介绍了查找某年获奖次数最多的电影-代码重复的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正尝试编写一个查询(PostgreSQL)以获取 2012年获奖次数最多的电影。

I am trying to write a query (PostgreSQL) to get "Movies with highest number of awards in year 2012."

我有以下表格:

CREATE TABLE Award(
    ID_AWARD bigserial CONSTRAINT Award_pk PRIMARY KEY,
    award_name VARCHAR(90),
    category VARCHAR(90),
    award_year integer,
    CONSTRAINT award_unique UNIQUE (award_name, category, award_year));

CREATE TABLE AwardWinner(
    ID_AWARD integer,
    ID_ACTOR integer,
    ID_MOVIE integer,
    CONSTRAINT AwardWinner_pk PRIMARY KEY (ID_AWARD));

然后我写了以下查询,给出了正确的结果,但是我有很多代码重复

And I written following query, which gives correct results, but there's is quite a lot of code duplication I think.

select * from 
(select id_movie, count(id_movie) as awards 
from Award natural join awardwinner 
where award_year = 2012 group by id_movie) as SUB
where awards = (select max(count) from 
(select id_movie, count(id_movie) 
from Award natural join awardwinner 
where award_year = 2012 group by id_movie) as SUB2);

所以 SUB SUB2 是完全相同的子查询。有更好的方法吗?

So SUB and SUB2 are exactly the same subquery. Is there a better way to do this?

推荐答案

获取所有获奖影片



Get all winning movies

SELECT id_movie, awards
FROM  (
   SELECT aw.id_movie, count(*) AS awards
         ,rank() OVER (ORDER BY count(aw.id_movie) DESC) AS rnk
   FROM   award       a
   JOIN   awardwinner aw USING (id_award)
   WHERE  a.award_year = 2012
   GROUP  BY aw.id_movie
   ) sub
WHERE  rnk = 1;



要点




  • 这应该比到目前为止的建议更简单,更快捷。使用 EXPLAIN ANALYZE 进行测试。

    在某些情况下,CTE有助于避免代码重复。但是现在还不行:子查询可以很好地完成工作,而且通常更快。

    There are cases where CTEs are instrumental to avoid code duplication. But not in this time: a subquery does the job just fine and is usually faster.

    您可以在同一查询上的聚合函数上运行窗口函数水平。这就是它起作用的原因:

    You can run a window function OVER an aggregate function on the same query level. That's why this works:

    rank() OVER (ORDER BY count(aw.id_movie) DESC) AS rnk
    


  • 我建议在JOIN条件下使用显式列名,而不要使用 NATURAL JOIN ,如果以后更改/向基础表中添加列,则很容易损坏。

    使用使用

  • I'd suggest to use explicit column names in the JOIN condition instead of NATURAL JOIN, which is prone to breakage if you later change / add columns to the underlying tables.
    The JOIN condition with USING is almost as short, but doesn't break as easily.

    因为几乎是短的,但是却不那么容易破解。因为 id_movie 不能为NULL (由JOIN条件以及pk的一部分决定),使用 count(*)较短,但速度稍快。结果相同。

    Since id_movie cannot be NULL (ruled out by the JOIN condition and also part of the pk) it is shorter ans slightly faster to use count(*) instead. Same result.

    更快,但是,如果您只需要一个赢家:

    Shorter and faster, yet, if you only need one winner:

    SELECT aw.id_movie, count(*) AS awards
    FROM   award       a
    JOIN   awardwinner aw USING (id_award)
    WHERE  a.award_year = 2012
    GROUP  BY 1
    ORDER  BY 2 DESC, 1 -- as tie breaker
    LIMIT  1
    

    使用位置引用( 1 2 )作为速记。

    我向 id_movie 添加了 ORDER BY 作为决胜局,以防多部电影获胜。

    Using positional references (1, 2) here as shorthand.
    I added id_movie to ORDER BY as tie breaker in case multiple movies should qualify for the win.

    这篇关于查找某年获奖次数最多的电影-代码重复的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆