Optimizing queries based on clustered and non-clustered indexes in SQL?

Question

I have been reading lately about how clustered and non-clustered indexes work. My understanding, in simple terms (correct me if I am wrong):

The data structure that backs both clustered and non-clustered indexes is a B-Tree.

Clustered index: physically sorts the data based on the index column (or key). You can have only one clustered index per table. If no index is specified during table creation, SQL Server will automatically create a clustered index on the primary key column.

Q1: Since the data is physically sorted based on the index, no extra space is needed here. Is this correct? And what happens when I drop the index I created?

Non-clustered index: in a non-clustered index, the leaf nodes of the tree contain the column values and a pointer (row locator) to the actual row in the table. Extra space is needed to store this non-clustered index physically on disk. However, there is no such limit of one per table for non-clustered indexes.

Q2: Does this mean a query on a non-clustered index column will not return sorted data?

Q3: There is an extra look-up here to locate the actual row data using the pointer at the leaf node. How much of a performance difference does this make compared to a clustered index?

Exercise:

Consider an Employee table:

CREATE TABLE Employee
(
PersonID int PRIMARY KEY,
Name varchar(255),
age int,
salary int
); 

Now I have created the Employee table (a default clustered index is created on Employee).

Two frequent queries on this table filter only on the age and salary columns. For the sake of simplicity, let's assume that the table is NOT frequently updated.

For example:

select * from employee where age > XXX;

select * from employee where salary > XXXX and salary < YYYY;

Q4: What is the best way to construct indexes so that queries on both of these columns have similar performance? If I have a clustered index on age, queries on the age column will be faster, but queries on the salary column will be slower.

Q5: On a related note, I have repeatedly seen that indexes (both clustered and non-clustered) should be created on columns with unique constraints. Why is that? What happens if this is not done?

Thank you very much. The posts I read are here:

http://javarevisited.blogspot.com/2013/08/difference-between-clustered-index-and-nonclustered-index-sql-server-database.html

http://msdn.microsoft.com/en-us/library/ms190457.aspx

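Clustered vs Non-Clustered

What do Clustered and Non clustered index actually mean?

What are the differences between a clustered and a non-clustered index?

How does database indexing work?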

Recommended answer

For SQL Server

Q1 Extra space is only needed for the clustered index if it is not unique. SQL Server will internally add a 4-byte uniquifier to a non-unique clustered index. This is because it uses the cluster key as the row identifier in non-clustered indexes.
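The Q1 point, including what happens when the index is dropped, can be sketched in T-SQL. This is a minimal illustration, not something from the question: the table name EmployeeByAge and the index name CIX_EmployeeByAge_age are hypothetical, and the sketch assumes the primary key is declared NONCLUSTERED so that the single clustered slot is free for another column.

```sql
-- Hypothetical variant of the Employee table: the primary key is kept
-- non-clustered so a different column can carry the clustered index.
CREATE TABLE EmployeeByAge
(
    PersonID int NOT NULL PRIMARY KEY NONCLUSTERED,
    Name     varchar(255),
    age      int,
    salary   int
);

-- Non-unique clustered index: rows that share an age get a hidden
-- uniquifier so that every row still has a distinct clustering key.
CREATE CLUSTERED INDEX CIX_EmployeeByAge_age
    ON EmployeeByAge (age);

-- Dropping the clustered index does not delete any rows. The table is
-- stored as a heap again, and its non-clustered indexes are rebuilt to
-- point at row identifiers (RIDs) instead of the clustering key.
DROP INDEX CIX_EmployeeByAge_age ON EmployeeByAge;
```

On a large table the drop is not free, since the non-clustered indexes have to be rebuilt as part of it.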

Q2 A non-clustered index can be read in order. That may aid queries where you specify an order. It may also make merge joins attractive. It will also help with range queries (x < col and y > col).
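As a rough illustration of reading a non-clustered index in order, consider the sketch below. The index name IX_Employee_salary_order and the literal salary bounds are made up for the example; Employee is the table from the question.

```sql
-- Hypothetical index on the salary column of the question's table.
CREATE NONCLUSTERED INDEX IX_Employee_salary_order
    ON Employee (salary);

-- Range predicate plus ORDER BY on the indexed column. The index keeps
-- its entries sorted by salary, so the optimizer can seek to the start
-- of the range and read forward, typically without a separate sort.
SELECT PersonID, salary
FROM Employee
WHERE salary > 30000 AND salary < 60000
ORDER BY salary;
```

Note that the ORDER BY is still required: without it, SQL Server makes no guarantee about the order of the result, whichever index it happens to use.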

Q3 SQL Server does an extra "bookmark lookup" when using a non-clustered index. But this happens only if it needs a column that isn't in the index. Note also that you can include extra columns in the leaf level of indexes. If an index can be used without the additional lookup, it is called a covering index.

If a bookmark lookup is required, it doesn't take a high percentage of rows before it becomes quicker just to scan the whole clustered index. The threshold depends on row size, key size, etc., but 5% of rows is a typical cut-off.
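One way to see this trade-off on your own data is to force each access path with a table hint and compare the logical reads. The sketch below is only that, a sketch: the index name IX_Employee_salary_lookup and the salary bounds are assumptions, and the numbers you get depend entirely on your table.

```sql
-- Hypothetical non-covering index: it holds salary and the cluster key,
-- but not Name or age.
CREATE NONCLUSTERED INDEX IX_Employee_salary_lookup
    ON Employee (salary);

SET STATISTICS IO ON;   -- report logical reads for each statement

-- Forced through the non-clustered index: every qualifying row needs
-- an extra lookup into the clustered index to fetch Name and age.
SELECT PersonID, Name, age, salary
FROM Employee WITH (INDEX(IX_Employee_salary_lookup))
WHERE salary > 30000 AND salary < 60000;

-- Forced through the clustered index (index id 1): a single pass over
-- the table, with no lookups.
SELECT PersonID, Name, age, salary
FROM Employee WITH (INDEX(1))
WHERE salary > 30000 AND salary < 60000;

SET STATISTICS IO OFF;
```

Widening or narrowing the salary range shows where the forced index path starts to cost more reads than the scan. The hints are for experimentation only; in normal use the optimizer makes this choice itself.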

Q4 If the most important thing in your application were making both of these queries as fast as possible, you could create a covering index for each of them:

create index IX_1 on employee (age) include (name, salary);
create index IX_2 on employee (salary) include (name, age);

Note that you don't have to explicitly include the cluster key, as the non-clustered index already has it as the row pointer.
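To check whether IX_1 and IX_2 are actually being picked up, one option is to run the two queries and then look at the index usage counters. This is a hedged sketch: the literal values are made up, and it assumes you have permission to read the sys.dm_db_index_usage_stats view, whose counters reset when the server restarts.

```sql
-- The two queries from the question, with example literal values.
-- PersonID comes back even though it is not in the INCLUDE list,
-- because the cluster key is stored in every non-clustered index row.
SELECT PersonID, Name, age, salary
FROM Employee
WHERE age > 40;

SELECT PersonID, Name, age, salary
FROM Employee
WHERE salary > 30000 AND salary < 60000;

-- Per-index usage counters for the Employee table in this database.
-- Seeks or scans on IX_1 and IX_2 with no growth in user_lookups
-- suggest the queries are being answered from the covering indexes.
SELECT i.name AS index_name,
       s.user_seeks,
       s.user_scans,
       s.user_lookups
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON s.object_id   = i.object_id
      AND s.index_id    = i.index_id
      AND s.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID(N'Employee');
```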

Q5 This is more important for cluster keys than for non-cluster keys because of the uniquifier. The real issue, though, is whether an index is selective for your queries. Imagine an index on a bit column. Unless the distribution of data is very skewed, such an index is unlikely to be used for anything.
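A quick, approximate way to judge selectivity is to compare the number of distinct values in a column with the number of rows. The query below is a simple sketch against the question's Employee table; the column aliases are illustrative.

```sql
-- Ratio of distinct values to total rows for each candidate column.
-- Values near 1 mean almost every row has its own value (selective);
-- values near 0 mean many rows share each value, as with a bit column.
-- NULLIF avoids a divide-by-zero error on an empty table.
SELECT COUNT(*)                                              AS total_rows,
       COUNT(DISTINCT age)                                   AS distinct_ages,
       COUNT(DISTINCT salary)                                AS distinct_salaries,
       COUNT(DISTINCT age)    * 1.0 / NULLIF(COUNT(*), 0)    AS age_selectivity,
       COUNT(DISTINCT salary) * 1.0 / NULLIF(COUNT(*), 0)    AS salary_selectivity
FROM Employee;
```

This ratio ignores skew, so it is only a first approximation: a column with few distinct values can still be useful for the rare values if the distribution is lopsided, as the answer notes.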

More info about the uniquifier. Imagine you had a non-unique clustered index on age, and a non-clustered index on salary. Say you had the following rows:

age | salary | uniquifier
20  | 1000   | 1
20  | 2000   | 2

Then the salary index would locate rows like so:

1000 -> 20, 1
2000 -> 20, 2

Say you ran the query select * from employee where salary = 1000, and the optimizer chose to use the salary index. It would find the pair (20, 1) from the index lookup, then look up this value in the main data.
