Why does MySQL allow you to group by columns that are not selected?

Problem description

I'm reading a book on SQL (Sams Teach Yourself SQL in 10 Minutes), and it's quite good despite its title. However, the chapter on GROUP BY confuses me:

"Grouping data is a simple process. The selected columns (the column list following the SELECT keyword in a query) are the columns that can be referenced in the GROUP BY clause. If a column is not found in the SELECT statement, it cannot be used in the GROUP BY clause. This is logical if you think about it—how can you group data on a report if the data is not displayed? "

So why does this statement work when I run it in MySQL?

select EMP_ID, SALARY
from EMPLOYEE_PAY_TBL
group by BONUS;

Solution

Because the book is wrong.

According to the ANSI standard, the columns in the GROUP BY have only one relationship to the columns in the SELECT: if a column appears in the SELECT without an aggregation function, then it (or the expression it is in) must also appear in the GROUP BY clause. MySQL actually relaxes this condition when the ONLY_FULL_GROUP_BY SQL mode is disabled.

Grouping by a column that is not selected is even useful. For instance, if you want to select the row with the highest id for each group from a table, one way to write the query is:

select t.*
from table t
where t.id in (select max(id)
               from table t
               group by thegroup
              );

(Note: There are other ways to write such a query; this is just an example.)
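
As a rough illustration of the query above, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL, purely so the example is self-contained; that substitution is an assumption, not something from the original answer. The table name items, the payload column, and the sample rows are hypothetical. Only thegroup and id come from the query.

```python
import sqlite3

# Hypothetical sample data: (id, thegroup, payload)
SAMPLE_ROWS = [
    (1, "a", "first a"),
    (2, "a", "second a"),
    (3, "b", "only b"),
    (4, "c", "first c"),
    (5, "c", "second c"),
]


def latest_rows_per_group(rows):
    """Return the row with the highest id in each group, ordered by id."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, thegroup TEXT, payload TEXT)"
    )
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    # The subquery groups by thegroup but selects only MAX(id):
    # the grouping column never appears in its SELECT list.
    result = conn.execute(
        """
        SELECT t.*
        FROM items t
        WHERE t.id IN (SELECT MAX(id)
                       FROM items
                       GROUP BY thegroup)
        ORDER BY t.id
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_rows_per_group(SAMPLE_ROWS):
        print(row)
```

The subquery is ordinary standard SQL, so it should behave the same way in MySQL and in other databases.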

EDIT:

The query that you are suggesting:

select EMP_ID, SALARY
from EMPLOYEE_PAY_TBL
group by BONUS;

would work in MySQL but probably not in any other database (unless BONUS happens to be a poorly named primary key on the table, but that is another matter). It will produce one row for each value of BONUS. For each row, it will get an arbitrary EMP_ID and SALARY from rows in that group. The documentation actually says "indeterminate", but I think arbitrary is easier to understand.
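
To make the "one row per BONUS value" behaviour concrete, here is a small sketch in Python, again using the standard-library sqlite3 module as a stand-in. SQLite happens to accept bare columns in a grouped query much as MySQL does in its relaxed mode, but it is a different engine, so treat this as an approximation. The sample rows and the helper names are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (EMP_ID, SALARY, BONUS)
PAY_ROWS = [
    (1, 30000, 1000),
    (2, 40000, 1000),
    (3, 50000, 2000),
    (4, 60000, 2000),
]


def make_pay_table(rows):
    """Create an in-memory EMPLOYEE_PAY_TBL filled with the given rows."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.execute(
        "CREATE TABLE EMPLOYEE_PAY_TBL (EMP_ID INTEGER, SALARY INTEGER, BONUS INTEGER)"
    )
    conn.executemany("INSERT INTO EMPLOYEE_PAY_TBL VALUES (?, ?, ?)", rows)
    return conn


def bare_columns(conn):
    """Non-standard form: EMP_ID and SALARY are neither grouped nor aggregated."""
    return conn.execute(
        "SELECT EMP_ID, SALARY FROM EMPLOYEE_PAY_TBL GROUP BY BONUS"
    ).fetchall()


def aggregated(conn):
    """Standard form: every selected column is grouped or aggregated."""
    return conn.execute(
        """
        SELECT BONUS, COUNT(*), MAX(SALARY)
        FROM EMPLOYEE_PAY_TBL
        GROUP BY BONUS
        ORDER BY BONUS
        """
    ).fetchall()


if __name__ == "__main__":
    conn = make_pay_table(PAY_ROWS)
    # One row per distinct BONUS; which employee is shown is up to the engine.
    print(bare_columns(conn))
    # Well-defined result for every group.
    print(aggregated(conn))
    conn.close()
```

With this data the first query returns two rows, one per distinct BONUS, but which employee represents each group is not something to rely on. The second query gives a well-defined answer for every group.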

What you should really know about this type of query is simply not to use it. All the "bare" columns in the SELECT (that is, with no aggregation functions) should be in the GROUP BY. This is required in most databases. Note that this is the inverse of what the book says. There is no problem doing:

select EMP_ID
from EMPLOYEE_PAY_TBL
group by EMP_ID, BONUS;

Except that you might get multiple rows back for the same EMP_ID with no way to distinguish among them.
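
The duplicate-row effect can be sketched as follows, in Python with the standard-library sqlite3 module standing in for MySQL. The sample data is hypothetical and assumes that one employee can have several pay rows with different BONUS values, which the original question does not state.

```python
import sqlite3

# Hypothetical sample data: (EMP_ID, SALARY, BONUS).
# Employee 1 appears twice, with two different bonuses.
PAY_ROWS = [
    (1, 30000, 1000),
    (1, 30000, 2000),
    (2, 40000, 1000),
]


def emp_ids_grouped_by_id_and_bonus(rows):
    """Run the EMP_ID, BONUS grouping and return the selected EMP_ID values."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.execute(
        "CREATE TABLE EMPLOYEE_PAY_TBL (EMP_ID INTEGER, SALARY INTEGER, BONUS INTEGER)"
    )
    conn.executemany("INSERT INTO EMPLOYEE_PAY_TBL VALUES (?, ?, ?)", rows)
    # Each (EMP_ID, BONUS) pair forms its own group, but only EMP_ID is
    # selected, so rows that differ only in BONUS look identical.
    result = [
        row[0]
        for row in conn.execute(
            """
            SELECT EMP_ID
            FROM EMPLOYEE_PAY_TBL
            GROUP BY EMP_ID, BONUS
            ORDER BY EMP_ID
            """
        )
    ]
    conn.close()
    return result


if __name__ == "__main__":
    print(emp_ids_grouped_by_id_and_bonus(PAY_ROWS))
```

Employee 1 shows up twice in the result because it belongs to two (EMP_ID, BONUS) groups, and nothing in the output says which row came from which group.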
