Which DBMSs allow ordering by an attribute that is not present in the select clause?


Question

Let's assume I have a table called Cars with 2 columns: CarName, BrandName

Now I want to execute this query:

select CarName
from Cars
order by BrandName

As you can see, I'd like to return a list that is sorted by a column that is not present in the select part of the query.

The basic (not optimized) execution sequence of SQL commands is: from, where, group by, having, select, order by.

The problem is that BrandName isn't part of what is left after the select command has been executed.

I've searched for this in books, on Google and on Stack Overflow, but so far I've only found several SO comments like "I know of a database system that doesn't allow it, but I don't remember which one".

So my questions are:
1) What do the standards SQL-92 or SQL99 say about this?
2) Which databases allow this query and which don't?

(Background: A couple of students asked this, and I want to give them the best answer possible)

EDIT:
- Successfully tested for Microsoft SQL Server 2012

Solution

Your query is perfectly legal syntax; you can order by columns that are not present in the select list.
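As a quick check, here is a minimal sketch that runs the query from the question against SQLite through Python's sqlite3 module. SQLite is only a convenient stand-in and the sample rows are made up for illustration; other DBMSs would need to be tested separately.

```python
import sqlite3

# Hypothetical sample rows; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cars (CarName TEXT, BrandName TEXT)")
conn.executemany(
    "INSERT INTO Cars VALUES (?, ?)",
    [("Golf", "Volkswagen"), ("A4", "Audi"), ("Corolla", "Toyota")],
)

# BrandName is used only as a sort key; it never appears in the result.
car_names = [
    row[0]
    for row in conn.execute("SELECT CarName FROM Cars ORDER BY BrandName")
]
print(car_names)
```

The car names come back in brand order (Audi, Toyota, Volkswagen) even though the brand is not returned.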

If you need the full specification of what is legal in an ORDER BY, the SQL Standard 2003 has a long list of statements about what the ORDER BY should and shouldn't contain (02-Foundation, page 415, section 7.13 <Query expression>, sub part 28). This confirms that your query is legal syntax.
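One way to reconcile this with the logical execution order from the question is to imagine the sort key being carried through the select step as a hidden extra column and dropped after sorting. The following plain-Python sketch only illustrates that idea; it is not the wording of the standard or the implementation of any particular DBMS, and the helper name select_order_by and the sample rows are hypothetical.

```python
# Hypothetical in-memory rows standing in for the Cars table.
cars = [
    {"CarName": "Golf", "BrandName": "Volkswagen"},
    {"CarName": "A4", "BrandName": "Audi"},
    {"CarName": "Corolla", "BrandName": "Toyota"},
]


def select_order_by(rows, select_cols, sort_col):
    # "select": project the requested columns and keep the sort key alongside.
    extended = [
        ({col: row[col] for col in select_cols}, row[sort_col]) for row in rows
    ]
    # "order by": sort on the hidden key.
    extended.sort(key=lambda pair: pair[1])
    # Drop the hidden key so that only the selected columns are returned.
    return [projected for projected, _ in extended]


result = select_order_by(cars, ["CarName"], "BrandName")
print(result)
```

Because every output row comes from exactly one input row, there is always exactly one BrandName to sort by, which is why this case is unproblematic.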

I think your confusion may come from two related cases: selecting or ordering by columns that are not present in the GROUP BY, and ordering by columns that are not in the select list when using DISTINCT.

Both have the same fundamental problem, and to my knowledge MySQL is the only DBMS that allows either.

The problem is that GROUP BY and DISTINCT collapse several rows into one. Columns that appear in neither are normally not needed, so it does not matter that they have different values across the collapsed rows. As soon as you refer to such a column, though, there is no single value to use. Imagine this simple data set:

ID  | Column1 | Column2  |
----|---------+----------|
1   |    A    |    X     |
2   |    A    |    Z     |
3   |    B    |    Y     |

If you write:

SELECT  DISTINCT Column1
FROM    T;

You would get

 Column1 
---------
     A   
     B   

If you then add ORDER BY Column2, which of the two Column2 values would you use to order A by, X or Z? There is no deterministic way to choose a value for Column2.
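A portable way to remove the ambiguity is to state explicitly which value should be used, by rewriting the DISTINCT as a GROUP BY and ordering by an aggregate. The sketch below uses SQLite through Python's sqlite3 module as an assumed test bed; the helper distinct_column1 is hypothetical. It shows that the two possible choices really do give different orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (ID INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [(1, "A", "X"), (2, "A", "Z"), (3, "B", "Y")],
)


def distinct_column1(order_expr):
    # order_expr is a trusted literal in this sketch; do not build SQL
    # from user input this way.
    sql = f"SELECT Column1 FROM T GROUP BY Column1 ORDER BY {order_expr}"
    return [row[0] for row in conn.execute(sql)]


by_min = distinct_column1("MIN(Column2)")  # A is sorted by X, B by Y
by_max = distinct_column1("MAX(Column2)")  # A is sorted by Z, B by Y
print(by_min, by_max)
```

Ordering by the minimum puts A first and ordering by the maximum puts B first, so a bare ORDER BY Column2 has no single correct answer.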

The same applies to selecting columns not in the group by. To simplify things just imagine the first two rows of the previous table:

ID  | Column1 | Column2  |
----|---------+----------|
1   |    A    |    X     |
2   |    A    |    Z     |

In MySQL you can write

SELECT  ID, Column1, Column2
FROM    T
GROUP BY Column1;

This actually breaks the SQL standard, but it works in MySQL. The trouble is that it is non-deterministic. The result:

ID  | Column1 | Column2  |
----|---------+----------|
1   |    A    |    X     |

Is no more or less correct than

ID  | Column1 | Column2  |
----|---------+----------|
2   |    A    |    Z     |

So what you are asking for is one row for each distinct value of Column1, which both result sets satisfy. How do you know which one you will get? You don't. It seems to be a fairly popular misconception that you can add an ORDER BY clause to influence the result, so that, for example, the following query:

SELECT  ID, Column1, Column2
FROM    T
GROUP BY Column1
ORDER BY ID DESC;

would ensure that you get the following result:

ID  | Column1 | Column2  |
----|---------+----------|
2   |    A    |    Z     |

because of the ORDER BY ID DESC. However, this is not true (as demonstrated here).

The MySQL documentation states:

The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause.

So even though you have an ORDER BY, it is not applied until after one row per group has been selected, and the choice of that row is non-deterministic.
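If the intent is "the row with the highest ID in each group", one deterministic formulation is to compute the maximum ID per group in a derived table and join back to the base table. This is a sketch under that assumed intent, run against SQLite through Python's sqlite3 module with the three-row sample table; the aliases latest and MaxID are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (ID INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [(1, "A", "X"), (2, "A", "Z"), (3, "B", "Y")],
)

# Pick the highest ID per Column1 explicitly, then fetch the matching rows.
sql = """
SELECT T.ID, T.Column1, T.Column2
FROM T
JOIN (SELECT Column1, MAX(ID) AS MaxID
      FROM T
      GROUP BY Column1) AS latest
  ON latest.Column1 = T.Column1 AND latest.MaxID = T.ID
ORDER BY T.ID DESC
"""
rows = conn.execute(sql).fetchall()
print(rows)
```

Here the choice of row is made by the join condition rather than left to the server, so the ORDER BY only has to arrange rows that are already well defined.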

The SQL standard does allow columns in the select list that are not contained in the GROUP BY or in an aggregate function, but these columns must be functionally dependent on a column in the GROUP BY. From the SQL-2003-Standard (5WD-02-Foundation-2003-09 - page 346) - http://www.wiscorp.com/sql_2003_standard.zip

15) If T is a grouped table, then let G be the set of grouping columns of T. In each <value expression> contained in <select list> , each column reference that references a column of T shall reference some column C that is functionally dependent on G or shall be contained in an aggregated argument of a <set function specification> whose aggregation query is QS.

For example, ID in the sample table is the PRIMARY KEY, so we know it is unique in the table. The following query therefore conforms to the SQL standard and would run in MySQL, but currently fails in many DBMSs (at the time of writing, PostgreSQL is the closest DBMS I know of to correctly implementing the standard; example here):

SELECT  ID, Column1, Column2
FROM    T
GROUP BY ID;

Since ID is unique for each row, there can be only one value of Column1 and one value of Column2 for each ID, so there is no ambiguity about what to return for each row.
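For a DBMS that rejects the query above, a commonly used portable rewrite is to list every selected column in the GROUP BY. The sketch below, again an assumption-laden check against SQLite through Python's sqlite3 module, compares the two forms. Note that SQLite accepts the first form because it permits bare columns in aggregate queries in general, not because it verifies the functional dependency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (ID INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [(1, "A", "X"), (2, "A", "Z"), (3, "B", "Y")],
)

# Grouping by the primary key alone: every group contains exactly one row.
grouped_by_key = conn.execute(
    "SELECT ID, Column1, Column2 FROM T GROUP BY ID ORDER BY ID"
).fetchall()

# Portable rewrite: every selected column also appears in the GROUP BY.
grouped_by_all = conn.execute(
    "SELECT ID, Column1, Column2 FROM T "
    "GROUP BY ID, Column1, Column2 ORDER BY ID"
).fetchall()

print(grouped_by_key == grouped_by_all)
```

Both forms return the same three rows, because grouping by a unique key cannot merge any rows.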
