Optimize slow ranking query

Problem description

I need to optimize a ranking query that is taking forever. The query itself works, but I know it's awful: I've just tried it with a good number of records and it times out.

I'll briefly explain the model. I have 3 tables: player, team and player_team. Players can belong to a team. Obvious as it sounds, players are stored in the player table and teams in the team table. In my app, each player can switch teams at any time, and a log has to be maintained. However, a player is considered to belong to only one team at a given time: the current team of a player is the last one he's joined.

The structure of player and team is not relevant, I think. Each has an id column as its PK. In player_team I have:

id          (PK)
player_id   (FK -> player.id)
team_id     (FK -> team.id)

Now, each team is assigned a point for each player that has joined. So I want to get a ranking of the top N teams with the largest number of players.

My first idea was to first get the current players from player_team (that is, at most one record for each player; this record must be the player's current team). I failed to find a simple way to do it (I tried GROUP BY player_team.player_id HAVING player_team.id = MAX(player_team.id), but that didn't cut it).
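
For illustration, "current team" can be pinned down on a few made-up rows. This is only a sketch: the table definition is trimmed to the three columns listed above and the data is hypothetical. The HAVING attempt compares a non-grouped column (player_team.id) with an aggregate; depending on the sql_mode, MySQL is likely either to reject that or to evaluate it against an arbitrary row of each group, which would explain why it did not work. Selecting MAX(id) per player directly avoids the problem.

```sql
-- Hypothetical sample data; only the column names come from the question.
CREATE TABLE player_team (
    id        INT PRIMARY KEY,
    player_id INT NOT NULL,
    team_id   INT NOT NULL
);

INSERT INTO player_team (id, player_id, team_id) VALUES
    (1, 1, 10),  -- player 1 joins team 10
    (2, 2, 10),  -- player 2 joins team 10
    (3, 1, 20);  -- player 1 switches to team 20

-- One row per player: the highest id marks that player's latest membership.
SELECT player_id, MAX(id) AS latest_id
FROM player_team
GROUP BY player_id;
-- Player 1's latest row is id 3 (team 20); player 2's is id 2 (team 10).
```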

I tried a number of queries that didn't work, but managed to get this one working:

SELECT 
    COUNT(*) AS total,
    pt.team_id,
    p.facebook_uid AS owner_uid, 
    t.color 
FROM 
    player_team pt 
JOIN player p ON (p.id = pt.player_id)  
JOIN team t ON (t.id = pt.team_id) 
WHERE 
    pt.id IN (
        SELECT max(J.id) 
        FROM player_team J 
        GROUP BY J.player_id
    )  

GROUP BY 
    pt.team_id 
ORDER BY 
    total DESC 
LIMIT 50            

As I said, it works, but it looks very bad and performs worse, so I'm sure there must be a better way to go. Does anyone have any ideas for optimizing this?

I'm using MySQL, by the way.

Thanks in advance.

Adding the EXPLAIN output. (Sorry, not sure how to format it properly.)

id  select_type         table  type    possible_keys                                 key               key_len  ref           rows    Extra
1   PRIMARY             t      ALL     PRIMARY                                       NULL              NULL     NULL          5000    Using temporary; Using filesort
1   PRIMARY             pt     ref     FKplayer_pt77082,FKplayer_pt265938,new_index  FKplayer_pt77082  4        t.id          30      Using where
1   PRIMARY             p      eq_ref  PRIMARY                                       PRIMARY           4        pt.player_id  1
2   DEPENDENT SUBQUERY  J      index   NULL                                          new_index         8        NULL          150000  Using index

Recommended answer

Try this:

SELECT  t.*, cnt
FROM    (
        SELECT  team_id, COUNT(*) AS cnt
        FROM    (
                SELECT  player_id, MAX(id) AS mid
                FROM    player_team
                GROUP BY
                        player_id
                ) q
        JOIN    player_team pt
        ON      pt.id = q.mid
        GROUP BY
                team_id
        ) q2
JOIN    team t
ON      t.id = q2.team_id
ORDER BY
        cnt DESC
LIMIT 50
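
To sanity-check the query above, it can be run against a tiny data set. The script below is a sketch under assumptions: the table definitions are cut down to the columns the query touches (id and color for team, the three listed columns for player_team), and all rows are invented.

```sql
-- Hypothetical, trimmed-down tables; the real ones have more columns.
CREATE TABLE team (
    id    INT PRIMARY KEY,
    color VARCHAR(16)
);
CREATE TABLE player_team (
    id        INT PRIMARY KEY,
    player_id INT NOT NULL,
    team_id   INT NOT NULL
);

INSERT INTO team (id, color) VALUES (10, 'red'), (20, 'blue');
INSERT INTO player_team (id, player_id, team_id) VALUES
    (1, 1, 10),  -- player 1 joins team 10
    (2, 2, 10),  -- player 2 joins team 10
    (3, 1, 20),  -- player 1 switches to team 20
    (4, 3, 10);  -- player 3 joins team 10

-- Same shape as the answer's query: latest row per player,
-- then a count of those rows per team, then the team details.
SELECT t.id, t.color, q2.cnt
FROM (
    SELECT pt.team_id, COUNT(*) AS cnt
    FROM (
        SELECT player_id, MAX(id) AS mid
        FROM player_team
        GROUP BY player_id
    ) q
    JOIN player_team pt ON pt.id = q.mid
    GROUP BY pt.team_id
) q2
JOIN team t ON t.id = q2.team_id
ORDER BY q2.cnt DESC
LIMIT 50;
-- Expected: team 10 (red) with cnt 2, then team 20 (blue) with cnt 1.
```

Player 1's earlier membership in team 10 is ignored because only the row with the highest id per player is joined back.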

Create an index on player_team (player_id, id), with the columns in this order, for this to run fast.
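
A minimal sketch of that index follows. The index name is made up for this example. The EXPLAIN output in the question already lists an 8-byte index called new_index on player_team, so it may be worth checking what exists first; whether new_index already covers these two columns is not stated in the question.

```sql
-- List the indexes that already exist on the table.
SHOW INDEX FROM player_team;

-- Index name is hypothetical; the column order (player_id, id) follows the answer.
CREATE INDEX idx_player_team_player_id_id
    ON player_team (player_id, id);
```

With player_id leading, the rows of each player sit next to each other in the index, ordered by id, which is what the inner GROUP BY player_id with MAX(id) needs.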
