MySQL: Total GROUP BY WITH ROLLUP curiosity


Question

I have two queries. One of them makes sense to me, the other doesn't. First one:

SELECT gender AS 'Gender', count(*) AS '#'
    FROM registrations 
    GROUP BY gender WITH ROLLUP

This gives me:

Gender       #
Female      20
Male        19
NULL        39

So, I get the count, and the total count. What I expected. Next one:

SELECT c.printable_name AS 'Country', count(*) AS '#' 
    FROM registrations r 
    INNER JOIN country c ON r.country = c.country_id 
    GROUP BY country WITH ROLLUP

Country         #
Denmark         9
Norway         10
Sweden         18
United States   1
Uzbekistan      1
Uzbekistan     39

Same result. But why do I get Uzbekistan for the total??

Recommended answer

But why do I get Uzbekistan for the total??

Because you're not SELECTing the item that you're GROUPing BY. If you said:

GROUP BY c.printable_name

You'd get the expected NULL. However you're grouping by a different column so MySQL doesn't know that printable_name is taking part in a rollup-group, and selects any old value from that column, in the join of all registrations. (So it is possible you will see other countries than Uzbekistan.)
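As a sketch, here is the question's second query with only the GROUP BY column changed, so that the selected column and the grouped column are the same. The alias n is an illustrative stand-in for the original '#'.

```sql
-- Group by the same column that is selected, so the rollup row
-- carries NULL in the Country column instead of an arbitrary name.
-- Assumes printable_name values are unique per country.
SELECT c.printable_name AS Country, COUNT(*) AS n
FROM registrations r
INNER JOIN country c ON r.country = c.country_id
GROUP BY c.printable_name WITH ROLLUP;
```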

This is part of a wider problem with MySQL being permissive on what you can SELECT in a GROUP BY query. For example, you can say:

SELECT gender FROM registrations GROUP BY country;

and MySQL will happily pick one of the gender values for a registration from each country, even though there is no direct causal link (aka "functional dependency") between country and gender. Other DBMSs will refuse the above command on the grounds that there isn't guaranteed to be one gender per country.(*)
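For comparison, two forms of that query that stricter DBMSs should also accept, because every selected column is either grouped or aggregated. This is only a sketch: the aliases n and some_gender are made up for illustration, and which form is appropriate depends on what you actually want to know.

```sql
-- Option A: group by every non-aggregated column that is selected.
SELECT country, gender, COUNT(*) AS n
FROM registrations
GROUP BY country, gender;

-- Option B: collapse gender to a single value per country with an
-- aggregate, so the choice is deterministic rather than arbitrary.
SELECT country, MIN(gender) AS some_gender
FROM registrations
GROUP BY country;
```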

Now, this:

SELECT c.printable_name AS 'Country', count(*) AS '#' 
FROM registrations r 
INNER JOIN country c ON r.country = c.country_id 
GROUP BY country

is OK, because there's a functional dependency between r.country and c.printable_name (assuming you have correctly described your country_id as a PRIMARY KEY).

However MySQL's WITH ROLLUP extension is a bit of a hack in the way it works. On the rollup row stage at the end, it runs over the entire pre-grouping result set to grab its values, and then sets the group-by column to NULL. It doesn't also null other columns that have a functional dependency on that column. It probably should, but MySQL currently doesn't really understand the whole thing about functional dependencies.

So if you select c.printable_name it will show you whichever country name value it randomly picked, and if you select c.country_id it will show you whichever country ID it randomly picked — even though c.country_id is the join criterion, so must be the same as r.country, which is NULL!

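Workarounds for this are:

  • group by printable_name; this should be OK as long as the printable_names are unique, or
  • select "r.country" as well as printable_name, and check whether it is NULL, or
  • forget WITH ROLLUP and do a separate query for the final sum. This will be slightly slower, but it will also be ANSI SQL-92 compliant, so your application could work on other databases.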
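The second and third workarounds could look roughly like this (the first one is just the GROUP BY c.printable_name change described earlier). These are sketches against the tables from the question; the aliases country_id, n and total are illustrative.

```sql
-- Workaround 2: also select the grouped column. In the rollup row
-- country_id is NULL, so that row can be recognised as the total
-- and the arbitrary Country value in it can be ignored.
SELECT r.country AS country_id,
       c.printable_name AS Country,
       COUNT(*) AS n
FROM registrations r
INNER JOIN country c ON r.country = c.country_id
GROUP BY r.country WITH ROLLUP;

-- Workaround 3: no WITH ROLLUP. One query for the per-country counts
-- (grouping by both columns so that only grouped columns are selected)...
SELECT c.printable_name AS Country, COUNT(*) AS n
FROM registrations r
INNER JOIN country c ON r.country = c.country_id
GROUP BY r.country, c.printable_name;

-- ...and a separate query for the grand total.
SELECT COUNT(*) AS total
FROM registrations r
INNER JOIN country c ON r.country = c.country_id;
```

With workaround 2 the NULL check can happen in application code when rendering the rows; the inner join already drops registrations whose country is NULL, so a NULL country_id can only come from the rollup row.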

(*: MySQL has an SQL_MODE option ONLY_FULL_GROUP_BY that is supposed to address this issue. Before MySQL 5.7.5 it went much too far and only let you select columns from the GROUP BY, not columns that have a functional dependency on the GROUP BY, so it made valid queries fail as well and was generally useless. From 5.7.5 on it detects functional dependence and is enabled by default.)
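To see what the mode does on your own server, something like the following can be run in a scratch session. It is a minimal sketch: the SET statement replaces the whole session mode rather than appending to it, which is fine for an experiment but probably not what you want in an application.

```sql
-- Show the SQL modes active for the current session.
SELECT @@SESSION.sql_mode;

-- Replace the session's modes with ONLY_FULL_GROUP_BY alone
-- (session scope only; other sessions are unaffected).
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- With the mode on, MySQL rejects this query with an error,
-- because gender is neither grouped nor aggregated.
SELECT gender FROM registrations GROUP BY country;
```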
