MySQL GROUP BY behavior (when using a derived table with order by)


Problem description


Since MySQL does not enforce the Single-Value Rule (see: https://stackoverflow.com/a/1646121/1688441), does a derived table with an ORDER BY guarantee which row values will be displayed? This concerns columns that are neither inside an aggregate function nor named in the GROUP BY clause.

I was looking at the question (MySQL GROUP BY behavior) after having commented on it and answered it (https://stackoverflow.com/a/24653572/1688441).

I don't agree with the accepted answer, but realized that a possibly improved answer would be:

SELECT * FROM 
(SELECT * FROM tbl order by timestamp) as tb2
GROUP BY userID;

http://sqlfiddle.com/#!2/4b475/18

Is this correct, though, or will MySQL still decide arbitrarily which row values are displayed?

Solution

This query:

SELECT *
FROM (SELECT * FROM tbl order by timestamp) as tb2
GROUP BY userID;

relies on a MySQL GROUP BY extension, which is documented in the MySQL reference manual. You are specifically relying on the assumption that all the columns come from the same row, and that this row is the first one encountered. MySQL specifically warns against making this assumption:

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.

So, you cannot depend on this behavior. It is easy enough to work around. Here is an example query:

-- keep a row only if no later row exists for the same userid
select t.*
from tbl t
where not exists (select 1 from tbl t2 where t2.userid = t.userid and t2.timestamp > t.timestamp);

With an index on tbl(userid, timestamp), this may even run faster. MySQL does a notoriously poor job of optimizing aggregations.
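To make the workaround concrete, here is a minimal sketch that can be pasted into a MySQL session. Only `tbl`, `userid` and `timestamp` come from the query above; the column types, the extra `note` column, the sample rows and the index name `idx_tbl_userid_timestamp` are assumptions made up for illustration.

```sql
-- Hypothetical table: only tbl, userid and timestamp appear in the answer;
-- the types and the note column are assumed for this example.
CREATE TABLE tbl (
    userid    INT         NOT NULL,
    timestamp DATETIME    NOT NULL,
    note      VARCHAR(20) NOT NULL
);

-- Made-up sample rows: two for user 1, one for user 2.
INSERT INTO tbl (userid, timestamp, note) VALUES
    (1, '2014-07-01 10:00:00', 'first'),
    (1, '2014-07-02 10:00:00', 'latest'),
    (2, '2014-07-01 09:00:00', 'only');

-- The composite index mentioned above (the index name is an assumption).
CREATE INDEX idx_tbl_userid_timestamp ON tbl (userid, timestamp);

-- Latest row per user: keep a row only if no row with a later
-- timestamp exists for the same userid.
SELECT t.*
FROM tbl t
WHERE NOT EXISTS (
    SELECT 1
    FROM tbl t2
    WHERE t2.userid = t.userid
      AND t2.timestamp > t.timestamp
)
ORDER BY t.userid;
```

With these rows, the query returns the 'latest' row for user 1 and the 'only' row for user 2, and every column in each result row comes from the same source row. Two caveats: if several rows for a user share the same maximum timestamp, all of them are returned; and to get the earliest row per user instead, flip the comparison to `t2.timestamp < t.timestamp`.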
