Role of selectivity in index scan/seek


Question

I have been reading in many SQL books and articles that selectivity is an important factor in creating an index. If a column has low selectivity, an index seek does more harm than good. But none of the articles explain why. Can anybody explain why this is so, or provide a link to a relevant article?

Recommended answer

From the Simple Talk article by Robert Sheldon, "14 SQL Server Indexing Questions You Were Too Shy To Ask" (https://www.simple-talk.com/sql/performance/14-sql-server-indexing-questions-you-were-too-shy-to-ask/):


The ratio of unique values within a key column is referred to as index selectivity. The more unique the values, the higher the selectivity, which means that a unique index has the highest possible selectivity. The query engine loves highly selective key columns, especially if those columns are referenced in the WHERE clause of your frequently run queries. The higher the selectivity, the faster the query engine can reduce the size of the result set. The flipside, of course, is that a column with relatively few unique values is seldom a good candidate to be indexed.
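As a rough illustration of the ratio described above, the following sketch computes selectivity as the number of distinct values divided by the number of rows. The function name `selectivity` and the sample columns are made up for this example; SQL Server derives its own estimates from statistics rather than from code like this.

```python
def selectivity(values):
    """Return distinct_count / total_count for a column (0.0 if the column is empty)."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical sample data: 1,000 rows in each column.
order_ids = list(range(1000))   # every value is unique, like a unique index key
genders = ["M", "F"] * 500      # only two distinct values

print(selectivity(order_ids))   # 1.0   (highest possible selectivity)
print(selectivity(genders))     # 0.002 (very low selectivity)
```

A value close to 1.0 means a filter on the column narrows the result set quickly; a value close to 0 means each distinct value matches a large share of the table.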

Also check out these articles:

  • This post by Pinal Dave
  • This other one on SQL Serverpedia
  • This forum post on SqlServerCentral
  • This article on SqlServerCentral

From the SqlServerCentral article:

In general, a nonclustered index should be selective. That is, the values in the column should be fairly unique and queries that filter on it should return small portions of the table.

The reason for this is that key/RID lookups are expensive operations and if a nonclustered index is to be used to evaluate a query it needs to be covering or sufficiently selective that the costs of the lookups aren’t deemed to be too high.

If SQL considers the index (or the subset of the index keys that the query would be seeking on) insufficiently selective then it is very likely that the index will be ignored and the query executed as a clustered index (table) scan.
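The trade-off between lookups and a scan can be made concrete with a toy cost model. This is only a sketch under stated assumptions, not SQL Server's actual costing: the function names, the index depth of 3, the 3 page reads per lookup, and the 100 rows per page are all invented for illustration. It compares "seek the nonclustered index, then do one key/RID lookup per matching row" against "read every page of the table once".

```python
import math


def seek_with_lookups_cost(matching_rows, index_depth=3, pages_per_lookup=3):
    """Toy cost: one index traversal plus one key/RID lookup per matching row."""
    return index_depth + matching_rows * pages_per_lookup


def scan_cost(total_rows, rows_per_page):
    """Toy cost: read every data page of the table exactly once."""
    return math.ceil(total_rows / rows_per_page)


def cheaper_plan(total_rows, rows_per_page, matching_rows):
    """Pick the plan with the lower toy cost (assumed numbers, not real optimizer logic)."""
    seek = seek_with_lookups_cost(matching_rows)
    scan = scan_cost(total_rows, rows_per_page)
    return "seek + lookups" if seek < scan else "scan"


TOTAL_ROWS = 1_000_000
ROWS_PER_PAGE = 100  # assumed, so the table occupies 10,000 data pages

for matching in (100, 3_000, 5_000, 500_000):
    print(matching, cheaper_plan(TOTAL_ROWS, ROWS_PER_PAGE, matching))
# 100 seek + lookups
# 3000 seek + lookups
# 5000 scan
# 500000 scan
```

With these assumed numbers the scan already wins once the filter matches about half a percent of the rows, because every extra matching row adds another lookup while the scan cost stays fixed. A covering index avoids the lookups entirely, which is why the quoted article lists it as the alternative to being selective.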

It is important to note that this does not just apply to the leading column. There are scenarios where a very unselective column can be used as the leading column, with the other columns in the index making it selective enough to be used.
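To see how an unselective leading column can still be part of a selective key, the sketch below measures selectivity over a combination of columns. The table layout (`status`, `customer_id`) and the helper `key_selectivity` are hypothetical and only serve to illustrate the point made above.

```python
def key_selectivity(rows, columns):
    """Distinct key combinations divided by total rows, for the given column positions."""
    if not rows:
        return 0.0
    keys = {tuple(row[c] for c in columns) for row in rows}
    return len(keys) / len(rows)


# Hypothetical table of (status, customer_id): 2 statuses x 500 customers = 1,000 rows.
rows = [(status, customer_id)
        for status in ("open", "closed")
        for customer_id in range(500)]

print(key_selectivity(rows, [0]))     # 0.002 (leading column alone)
print(key_selectivity(rows, [0, 1]))  # 1.0   (leading column plus second key column)
```

A query that filters on both columns can seek on the full key, so the combined selectivity is what matters; a query that filters only on the leading column sees the low figure.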
