SQL Server索引哪些应该是群集的? [英] SQL Server Index Which should be clustered?

查看:146
本文介绍了SQL Server索引哪些应该是群集的?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在某些表上有很多索引,它们都很相似,我想知道Clustered Index是否在正确的列上。以下是两个最活跃索引的统计数据:

I have a number of indexes on some tables, they are all similar and I want to know if the Clustered Index is on the correct column. Here are the stats from the two most active indexes:

Nonclustered
I3_Identity (bigint)
rows: 193,781
pages: 3821
MB: 29.85
user seeks: 463,355
user_scans: 784
user_lookups: 0
updates: 256,516

Clustered Primary Key
I3_RowId (varchar(80))
rows: 193,781
pages: 24,289
MB: 189.76
user_seeks: 2,473,413
user_scans: 958
user_lookups: 463,693
updates: 2,669,261

如您所见, PK经常被寻求,但是i3_identity专栏的所有搜索都在对这个PK进行关键查找,所以我真的从I3_Identity的索引中获益很多吗?我应该更改为使用I3_Identity作为群集吗?这可能会产生巨大的影响,因为这个表结构在我工作的地方重复了大约10000次,所以任何帮助都会受到赞赏。

As you can see, the PK is being seeked often, but all the seeks for the i3_identity column are doing key lookups to this PK as well, so am I really benefiting from the index on I3_Identity much at all? Should I change to using the I3_Identity as the clustered? This could have a huge impact as this table structure is repeated about 10000 times where I work, so any help would be appreciated.

推荐答案

弗雷德里克很好地总结了这一点,这正是金伯利·特里普所宣扬的:聚类键应该是稳定的(永不改变),不断增加(IDENTITY INT),小而独特。

Frederik sums it up nicely, and that's really what Kimberly Tripp also preaches: the clustering key should be stable (never changes), ever increasing (IDENTITY INT), small and unique.

在您的方案中,我更倾向于将聚类键放在BIGINT列而不是VARCHAR(80)列上。

In your scenario, I'd much rather put the clustering key on the BIGINT column rather than the VARCHAR(80) column.

首先,使用BIGINT列,可以相当容易地强制执行唯一性(如果您自己不强制执行并保证唯一性,SQL Server将为每个都添加一个4字节的uniquefier你的行)它的平均值比VARCHAR(80)小很多。

First of all, with the BIGINT column, it's reasonably easy to enforce uniqueness (if you don't enforce and guarantee uniqueness yourself, SQL Server will add a 4-byte "uniquefier" to each and every one of your rows) and it's MUCH smaller on average than a VARCHAR(80).

为什么尺寸如此重要?集群密钥也将被添加到EACH和每个非聚集索引中 - 所以如果你有很多行和很多非聚集索引,那么40-80字节对8字节可以很快成为巨大的区别。

Why is size so important? The clustering key will also be added to EACH and every one of your non-clustered indexes - so if you have a lot of rows and a lot of non-clustered indexes, having 40-80 byte vs. 8 byte can quickly make a HUGE difference.

另外,另一个性能提示:为了避免所谓的书签查找(从非聚集索引中的值通过聚类键进入实际数据叶) SQL Server 2005在非聚集索引中引入了包含列的概念。这些都非常有用,而且经常被忽视。如果您的查询通常需要索引字段加上数据库中的一个或两个其他字段,请考虑包含这些字段以实现所谓的覆盖索引。再次 - 请参阅Kimberly Tripp的精彩文章 - 她是SQL Server Indexing Goddess! :-)她可以比我更好地解释这些东西...

Also, another performance tip: in order to avoid the so-called bookmark lookups (from a value in your non-clustered index via the clustering key into the actual data leaf pages), SQL Server 2005 has introduced the notion of "included columns" in your non-clustered indexes. Those are extremely helpful, and often overlooked. If your queries often require the index fields plus just one or two other fields from the database, consider including those in order to achieve what is called "covering indexes". Again - see Kimberly Tripp's excellent article - she's the SQL Server Indexing Goddess! :-) and she can explain that stuff much better than I can...

总结一下:把你的聚类键放在一个小而稳定的独特专栏上 - 你会做得很好!

So to sum it up: put your clustering key on a small, stable, unique column - and you'll do just fine!

Marc

这篇关于SQL Server索引哪些应该是群集的?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆