sql server如何对您的数据进行排序? [英] How does sql server sort your data?

查看:597
本文介绍了sql server如何对您的数据进行排序?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想知道sql server如何对数据进行排序. 我注意到,如果我的表不包含"Id"列,而您选择的数据不带"ORDER BY",则sql server不会自动在主列上进行排序.

I was wondering how sql server sorts it's data. I noticed that if I have a table that doesn't contain the column "Id" and you select data without "ORDER BY" sql server doesn't automatically sort on the primary column.

有人知道sql服务器遵循什么规则对数据进行排序吗?

Does anyone know what rule sql server follows to sort it's data?

推荐答案

尽管很好奇如何解释您经常看到相同的顺序,但我想指出的是,这从来都不是一个好主意.依赖于由基础数据库引擎的特定实现引起的隐式顺序.换句话说,很高兴知道为什么,但是您永远不要依赖它.对于MS SQL,唯一可靠地按特定顺序传递行的是明确的ORDER BY子句.

Although it's good to wonder about how it might be explained that you often see the same order, I'd like to point out that it never a good idea to rely on implicit order caused by the particular implementation of the underlying database engine. In other words, its nice to know why, but you should never ever rely on it. For MS SQL, the only thing that reliably delivers the rows in a certain order, is an explicit ORDER BY clause.

不仅不同的RDMBS行为不同,而且由于更新(补丁),一个特定实例的行为也可能不同.不仅如此,甚至RDBMS软件的状态也可能会产生影响:热"数据库的行为与冷"数据库的行为不同,一个小表的行为与一个大表的行为不同.

Not only do different RDMBS-es behave differently, one particular instance may behave differently due to an update (patch). Not only that, even the state of the RDBMS software may have an impact: a "warm" database behaves differently than a "cold" one, a small table behaves differently than a large one.

即使您具有有关实现的背景信息(例如:存在聚集索引,因此很可能将按聚集索引的顺序返回数据"),也总是有可能存在另一种机制您不知道会导致以不同的顺序返回行(例如:如果另一个会话只是使用显式的ORDER BY进行了全表扫描,则结果集可能已经被缓存;随后的全扫描将尝试执行以下操作:从缓存中返回行"; ex2:可以通过对数据进行排序来实现GROUP BY,从而影响返回行的顺序"; ex3:如果所选列都在已经缓存的二级索引中在内存中,引擎可能会扫描二级索引而不是表,最有可能按二级索引的顺序返回行."

Even if you have background information about the implementation (ex: "there is a clustered index, thus it is likely the data will be returned by order of the clustered index"), there is always a possibility that there is another mechanism you don't know about that causes the rows to be returned in a different order (ex1: "if another session just did a full table scan with an explicit ORDER BY the resultset may have been cached; a subsequent full scan will attempt to return the rows from the cache"; ex2: "a GROUP BY may be implemented by sorting the data, thus impacting the order the rows are returned"; ex3: "If the selected columns are all in a secondary index that is already cached in memory, the engine may scan the secondary index instead of the table, most likely returning the rows by order of the secondary index").

这是一个非常简单的测试,它说明了我的一些观点.

Here's a very simple test that illustrates some of my points.

首先,启动SQL Server(我正在使用2008).创建此表:

First, startup SQL server (I'm using 2008). Create this table:

create table test_order (
    id int not null identity(1,1) primary key
,   name varchar(10) not null 
)

检查表并见证创建了聚集索引以支持id列上的primary key.例如,在sql server management studio中,您可以使用树形视图并导航到表下方的indexs文件夹.在那里,您应该看到一个索引,其名称类似于:PK__test_ord__3213E83F03317E3D (Clustered)

Examine the table and witness that a clusted index was created to support the primary key on the id column. For example, in sql server management studio, you can use the tree view and navigate to the indexes folder beneath your table. There you should see one index, with a name like: PK__test_ord__3213E83F03317E3D (Clustered)

使用以下语句插入第一行:

Insert the first row with this statement:

insert into test_order(name)
select RAND()

通过重复此语句16次来插入更多行:

Insert more rows by repeating this statement 16 times:

insert into test_order(name)
select RAND()
from   test_order

您现在应该有65536行:

You should now have 65536 rows:

select COUNT(*) 
from   test_order

现在,选择所有行而不使用以下命令:

Now, select all rows without using an order by:

select *
from   test_order

很有可能,结果将按主键的顺序返回(尽管不能保证).这是我得到的结果(实际上是按主键的顺序):

Most likely, the results will be returned by order of the primary key (although there is no guarantee). Here's the result I got (which is indeed by order of primary key):

#      id    name
1      1     0.605831
2      2     0.517251
3      3     0.52326
.      .     .......
65536  65536 0.902214

(#不是列,而是结果中行的顺序位置)

(the # is not a column but the ordinal position of the row in the result)

现在,在name列上创建二级索引:

Now, create a secondary index on the name column:

create index idx_name on test_order(name)

选择所有行,但仅检索name列:

Select all rows, but retrieve only the name column:

select name
from   test_order

最有可能按二级索引idx_name的顺序返回结果,因为仅通过扫描索引即可解决查询(i.o.w. idx_name covering 索引).这是我得到的结果,确实是按name的顺序.

Most likely the results will be returned by order of the secondary index idx_name, since the query can be resolved by only scanning the index (i.o.w. idx_name is a covering index). Here's the result I got, which is indeed by order of name.

#      name
1      0.0185732
2      0.0185732
.      .........
65536  0.981894

现在,再次选择所有列和所有行:

Now, select all columns and all rows again:

select * 
from test_order

这是我得到的结果:

#      id    name
1      17    0.0185732
2      18    0.0185732
3      19    0.0185732
...    ..    .........

如您所见,

与我们第一次运行此查询完全不同. (看来行是由二级索引排序的,但是我没有解释为什么应该这样).

as you can see, quite different from the first time we ran this query. (It looks like the rows are ordered by the secondary index, but I don't have an explanation why that should be so).

无论如何,最重要的是-不要依赖隐式顺序.您可以考虑为什么可以观察到特定顺序的解释,但是即使如此,如果您不了解具体的实现和运行时状态,就无法始终对其进行预测(就像在后一种情况中一样).

Anyway, the bottom line is - don't rely on implicit order. You can think of explanations why a particular order can be observed, but even then you can't always predict it (like in the latter case) without having intimate knowledge of implementation and runtime state.

这篇关于sql server如何对您的数据进行排序?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆