How to optimize a MySQL query (group and order)
Question
Hey all, I've got a query in need of optimizing. It works, but it's a dog, performance-wise.
It looks like this:
SELECT *
FROM (
    SELECT *
    FROM views
    WHERE user_id = '1'
    ORDER BY
        page DESC
) v
GROUP BY
    v.session
I'm tracking views of different pages, and I want to know the highest page per session, so that I know how far users have clicked through in any given session (they're required to view every page, all the way to the end).
Basically, what I'm trying to do is ORDER the results before the GROUP BY, which the above achieves, at significant cost.
Can anyone slap me over the head with how to do this? Thanks, guys!
Update:
EXPLAIN output:
id  select_type  table       type   possible_keys  key   key_len  ref   rows    Extra
1   PRIMARY      <derived2>  ALL    NULL           NULL  NULL     NULL  3545    Using temporary; Using filesort
2   DERIVED      views       index  NULL           page  5        NULL  196168  Using where
Schema:
ID int(8) unsigned (NULL) NO PRI (NULL) auto_increment select,insert,update,references
page int(8) (NULL) YES MUL (NULL) select,insert,update,references
user_id int(8) (NULL) YES (NULL) select,insert,update,references
session int(8) (NULL) YES (NULL) select,insert,update,references
created datetime (NULL) NO select,insert,update,references
Index information:
views 0 PRIMARY 1 ID A 196008 (NULL) (NULL) BTREE
views 1 page 1 page A 259 (NULL) (NULL) YES BTREE
Recommended answer
"I'm tracking views to different pages, and I want to know the highest page per session, in order to know how far they've clicked through (they're required to view every page all the way to the end) in any given session."
Ordering before grouping is a highly unreliable way to do this.
MySQL extends the GROUP BY syntax: you can use ungrouped and unaggregated fields in the SELECT and ORDER BY clauses.
In this case, an arbitrary value of page is output for each session.
The documentation explicitly states that you should never make any assumptions about exactly which value it will be:
"Do not use this feature if the columns you omit from the GROUP BY part are not constant in the group. The server is free to return any value from the group, so the results are indeterminate unless all values are the same."
However, in practice, the values from the first row scanned are returned.
Since you are using ORDER BY page DESC in your subquery, this row happens to be the row with the maximal page per session.
You shouldn't rely on it, since this behaviour is undocumented, and if some other row is returned in the next version, it will not be considered a bug.
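One way to have the server flag this pattern, rather than silently return an arbitrary row, is the ONLY_FULL_GROUP_BY SQL mode. The following is a minimal sketch, assuming the views table from the question and a MySQL version where this mode behaves as it does in 5.7 and later. Note that the SET statement here replaces the session's entire sql_mode, which is a simplification made for illustration.

```sql
-- Show the SQL mode currently in effect for this session.
SELECT @@SESSION.sql_mode;

-- Assumption for this sketch: set the session mode to ONLY_FULL_GROUP_BY alone.
-- This drops any other modes for the current session only.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- With the mode enabled, the original query is expected to be rejected
-- (error 1055), because SELECT * includes columns that are neither
-- grouped nor aggregated.
SELECT *
FROM (
    SELECT *
    FROM views
    WHERE user_id = 1
    ORDER BY page DESC
) v
GROUP BY v.session;
```

Getting an error here is the desired outcome: it shows that the query depends on the undocumented "first row wins" behaviour.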
But you don't even have to resort to such nasty tricks.
Just use an aggregate function:
SELECT session, MAX(page)
FROM views
WHERE user_id = '1'
GROUP BY
    session
This is the documented and clean way to do what you want.
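To see the aggregate approach on a small data set, here is a self-contained sketch. The table name views_demo and the sample rows are made up for illustration; the columns mirror the schema shown in the question.

```sql
-- Hypothetical scratch table mirroring the columns of views.
CREATE TEMPORARY TABLE views_demo (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    page    INT,
    user_id INT,
    session INT,
    created DATETIME NOT NULL
);

-- Made-up sample data: user 1 has two sessions, user 2 has one.
INSERT INTO views_demo (page, user_id, session, created) VALUES
    (1, 1, 10, '2020-01-01 10:00:00'),
    (2, 1, 10, '2020-01-01 10:01:00'),
    (3, 1, 10, '2020-01-01 10:02:00'),
    (1, 1, 11, '2020-01-02 09:00:00'),
    (2, 1, 11, '2020-01-02 09:01:00'),
    (5, 2, 12, '2020-01-03 08:00:00');

-- Highest page reached in each session of user 1.
SELECT session, MAX(page) AS max_page
FROM views_demo
WHERE user_id = 1
GROUP BY session
ORDER BY session;
-- Returns two rows: (10, 3) and (11, 2).
```

The explicit ORDER BY session only fixes the order of the output rows; the per-session maximum comes from MAX(page), not from any ordering of the input.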
Create a composite index on (user_id, session, page) for the query to run faster.
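Creating that index might look like the following sketch. The index name ix_views_user_session_page is hypothetical, and the statement assumes the views table from the question.

```sql
-- Index name is hypothetical; choose one that fits your naming conventions.
ALTER TABLE views
    ADD INDEX ix_views_user_session_page (user_id, session, page);

-- Re-check the execution plan after adding the index.
EXPLAIN
SELECT session, MAX(page) AS max_page
FROM views
WHERE user_id = 1
GROUP BY session;
```

With the index in place, the plan would ideally list it as the key used and no longer report "Using temporary; Using filesort", although the actual plan depends on the MySQL version and the data distribution. The column order matters: user_id comes first to serve the WHERE filter, then session for the grouping, then page for the maximum.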
If you need all columns from your table, not only the aggregated ones, use this syntax:
SELECT v.*
FROM (
    SELECT DISTINCT user_id, session
    FROM views
) vo
JOIN views v
ON v.id =
    (
        SELECT id
        FROM views vi
        WHERE vi.user_id = vo.user_id
            AND vi.session = vo.session
        ORDER BY
            page DESC
        LIMIT 1
    )
This assumes that id is the PRIMARY KEY on views.
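On MySQL 8.0 or later, window functions offer another way to fetch the full row with the highest page per session. This is an alternative technique, not the one used in the answer above. It is a sketch that assumes the same views table; the alias rn and the id DESC tie-breaker are choices made for illustration.

```sql
-- Requires MySQL 8.0+ (window function support).
SELECT id, page, user_id, session, created
FROM (
    SELECT v.*,
           ROW_NUMBER() OVER (
               PARTITION BY user_id, session
               ORDER BY page DESC, id DESC  -- id breaks ties between equal pages
           ) AS rn
    FROM views v
    WHERE user_id = 1
) ranked
WHERE rn = 1;
```

Each row is numbered within its (user_id, session) group from the highest page downward, and the outer query keeps only the first row of each group.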