How to optimize a MySQL query (group and order)


Question

Hey all, I've got a query in need of optimizing. It works, but it's a dog, performance-wise.

It looks like this:

SELECT  *
FROM    (
        SELECT  *
        FROM    views
        WHERE   user_id = '1'
        ORDER BY
                page DESC
        ) v
GROUP BY
        v.session

I'm tracking views of different pages, and I want to know the highest page per session, in order to know how far users have clicked through in any given session (they're required to view every page all the way to the end).

Basically, what I'm trying to do is ORDER the results before the GROUP, which the above achieves, at significant cost.

Can anyone slap me over the head with how to do this? Thanks, guys!

Update:

EXPLAIN:

id  select_type  table       type   possible_keys  key   key_len  ref   rows    Extra
1   PRIMARY      <derived2>  ALL    NULL           NULL  NULL     NULL  3545    Using temporary; Using filesort
2   DERIVED      views       index  NULL           page  5        NULL  196168  Using where

Schema:

ID       int(8) unsigned  (NULL)     NO      PRI     (NULL)   auto_increment  select,insert,update,references         
page     int(8)           (NULL)     YES     MUL     (NULL)                   select,insert,update,references         
user_id  int(8)           (NULL)     YES             (NULL)                   select,insert,update,references         
session  int(8)           (NULL)     YES             (NULL)                   select,insert,update,references         
created  datetime         (NULL)     NO                                       select,insert,update,references       

Index info:

views            0  PRIMARY              1  ID           A               196008    (NULL)  (NULL)          BTREE    

views            1  page                 1  page         A                  259    (NULL)  (NULL)  YES     BTREE 

Recommended answer

"I'm tracking views to different pages, and I want to know the highest page per session, in order to know how far they've clicked through (they're required to view every page all the way to the end) in any given session."

Ordering before grouping is a highly unreliable way to do this.

MySQL extends the GROUP BY syntax: you can use ungrouped and unaggregated fields in the SELECT and ORDER BY clauses.

In this case, an arbitrary value of page is returned for each session.

The documentation explicitly states that you should never make any assumptions about which value it will be:

Do not use this feature if the columns you omit from the GROUP BY part are not constant in the group. The server is free to return any value from the group, so the results are indeterminate unless all values are the same.
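One way to see whether a query depends on this extension is to enable the ONLY_FULL_GROUP_BY SQL mode for the current connection and re-run it. This is a minimal sketch, not part of the original answer: it assumes the views table from the question, and the exact error text depends on the MySQL version.

```sql
-- Replaces the SQL mode for this connection only; any other modes that
-- were set for the session are dropped for the duration of the demo.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- With the mode enabled, this query should be rejected, because SELECT *
-- includes columns that are neither grouped nor aggregated.
SELECT  *
FROM    (
        SELECT  *
        FROM    views
        WHERE   user_id = 1
        ORDER BY
                page DESC
        ) v
GROUP BY
        v.session;
```

If the server refuses the statement, the query was relying on the indeterminate behaviour described in the quote above.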

However, in practice, the values from the first row scanned are returned.

Since you are using ORDER BY page DESC in your subquery, this row happens to be the row with the maximal page per session.

You shouldn't rely on it: this behaviour is undocumented, and if some other row is returned in the next version, it will not be considered a bug.

But you don't even have to resort to such nasty tricks.

Just use an aggregate function:

SELECT  MAX(page)
FROM    views
WHERE   user_id = '1'
GROUP BY
        session

This is the documented and clean way to do what you want.
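The aggregate query returns only the maximum page values, without saying which session each one belongs to. A small variation that also returns the session is sketched here; the alias max_page is an illustrative name, and user_id is compared as an integer on the assumption that it matches the int(8) column in the schema.

```sql
-- One row per session for the given user, with the highest page reached.
SELECT  session,
        MAX(page) AS max_page   -- alias chosen for illustration
FROM    views
WHERE   user_id = 1
GROUP BY
        session;
```

Because session is the grouping column and page is aggregated, every selected value is well defined and nothing depends on row order.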

Create a composite index on (user_id, session, page) to make the query run faster.
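The index could be created roughly as follows. The index name is hypothetical; the column order follows the suggestion above, with the equality column (user_id) first, the grouping column (session) next, and the aggregated column (page) last, so that the query can be answered from the index alone.

```sql
-- Index name is made up for this example; pick one that fits your conventions.
ALTER TABLE views
    ADD INDEX ix_views_user_session_page (user_id, session, page);

-- Re-check the plan afterwards. The exact output depends on the MySQL
-- version and the data distribution.
EXPLAIN
SELECT  session, MAX(page) AS max_page
FROM    views
WHERE   user_id = 1
GROUP BY
        session;
```

In the new plan, the key column would be expected to name the composite index, and the "Using temporary; Using filesort" entries seen in the original EXPLAIN output should ideally be gone.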

If you need all columns from your table, not only the aggregated ones, use this syntax:

SELECT  v.*
FROM    (
        SELECT  DISTINCT user_id, session
        FROM    views
        ) vo
JOIN    views v
ON      v.id =
        (
        SELECT  id
        FROM    views vi
        WHERE   vi.user_id = vo.user_id
                AND vi.session = vo.session
        ORDER BY
                page DESC
        LIMIT 1
        )

This assumes that id is a PRIMARY KEY on views.
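A different formulation of the same idea, not taken from the original answer, joins the table back to its per-session maxima. It is only a sketch: it restricts the result to one user, uses the illustrative alias max_page, and behaves differently from the LIMIT 1 version when there are ties.

```sql
-- Full rows for the highest page in each session of one user.
SELECT  v.*
FROM    (
        SELECT  session, MAX(page) AS max_page
        FROM    views
        WHERE   user_id = 1
        GROUP BY
                session
        ) m
JOIN    views v
ON      v.user_id = 1
        AND v.session = m.session
        AND v.page = m.max_page;
```

If the same page was viewed more than once in a session, this returns every matching row, whereas the correlated subquery with LIMIT 1 returns exactly one. Rows with a NULL session are also dropped by the join condition, since session is nullable in the schema.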
