非聚集索引在堆上的性能与聚集索引的性能 [英] Performance of Non Clustered Indexes on Heaps vs Clustered Indexes

查看:86
本文介绍了非聚集索引在堆上的性能与聚集索引的性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这份2007年白皮书比较了单个select/insert的性能/delete/update和range select语句在作为聚簇索引组织的表上与在与CI表相同的键列上具有非聚簇索引的堆组织的表上相比.

This 2007 White Paper compares the performance for individual select/insert/delete/update and range select statements on a table organized as a clustered index vs that on a table organized as a heap with a non clustered index on the same key columns as the CI table.

通常,聚集索引选项在测试中表现更好,因为只需要维护一种结构,并且因为不需要书签查找.

Generally the clustered index option performed better in the tests as there is only one structure to maintain and because there is no need for bookmark lookups.

本文未涉及的一个潜在有趣案例是将堆上的非聚簇索引与聚簇索引上的非聚簇索引进行比较.在那种情况下,我希望堆甚至可以在NCI叶级别上表现更好,因为SQL Server具有直接遵循的RID,而不需要遍历聚集索引.

One potentially interesting case not covered by the paper would have been a comparison between a non clustered index on a heap vs a non clustered index on a clustered index. In that instance I would have expected the heap might even perform better as once at the NCI leaf level SQL Server has a RID to follow directly rather than needing to traverse the clustered index.

有人知道在这一领域进行过类似的正式测试吗?如果是,结果是什么?

Is anyone aware of similar formal testing that has been carried out in this area and if so what were the results?

推荐答案

要检查您的请求,我按照以下方案创建了2个表:

To check your request I created 2 tables following this scheme:

    表示余额信息的
  • 790万条记录.
  • 从1到790万的身份字段
  • 一个数字字段,将记录分为约50万个组.
  • 7.9 million records representing balance information.
  • an identity field counting from 1 to 7.9 million
  • a number field grouping the records in about 500k groups.

第一个称为heap的表在字段group上获得了非聚集索引.第二个表clust在顺序字段key上有一个聚集索引,在字段group

The first table called heap got a non clustered index on the field group. The second table called clust got a clustered index on the sequential field called key and a nonclustered index on the field group

测试在具有2个超线程内核,4Gb内存和64位Windows 7的I5 M540处理器上运行.

The tests were run on an I5 M540 processor with 2 hyperthreaded cores, 4Gb memory and 64-bit windows 7.

Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (X64) 
Apr  2 2010 15:48:46 
Developer Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)  

2011年3月9日更新:我通过运行以下.net代码并在Sql Server Profiler中记录了持续时间,CPU,读取,写入和行计数,做了第二次更广泛的基准测试. (使用的CommandText将在结果中提及.)

Update on 9 Mar 2011: I did a second more extensive benchmark by running the following .net code and logging Duration, CPU, Reads, Writes and RowCounts in Sql Server Profiler. (The CommandText used will be mentioned in the results.)

注意: CPU和持续时间以毫秒为单位

NOTE: CPU and Duration are expressed in milliseconds

  • 1000个查询
  • 从结果中消除
  • 零CPU查询
  • 从结果中删除
  • 0条受影响的行
  • 1000 queries
  • zero CPU queries are eliminated from the results
  • 0 rows affected are eliminated from the results

int[] idList = new int[] { 6816588, 7086702, 6498815 ... }; // 1000 values here.
using (var conn = new SqlConnection(@"Data Source=myserver;Initial Catalog=mydb;Integrated Security=SSPI;"))
            {
                conn.Open();
                using (var cmd = new SqlCommand())
                {
                    cmd.Connection = conn;
                    cmd.CommandType = CommandType.Text;
                    cmd.CommandText = "select * from heap where common_key between @id and @id+1000"; 
                    cmd.Parameters.Add("@id", SqlDbType.Int);
                    cmd.Prepare();
                    foreach (int id in idList)
                    {
                        cmd.Parameters[0].Value = id;

                        using (var reader = cmd.ExecuteReader())
                        {
                            int count = 0;
                            while (reader.Read())
                            {
                                count++;
                            }
                            Console.WriteLine(String.Format("key: {0} => {1} rows", id, count));
                        }
                    }
                }
            }

更新结束于2011年3月9日.

SELECT性能

要检查性能编号,我在堆表上一次在集群表上执行了以下查询:

SELECT performance

To check performanc numbers I performed the following queries once on the heap table and once on the clust table:

select * from heap/clust where group between 5678910 and 5679410
select * from heap/clust where group between 6234567 and 6234967
select * from heap/clust where group between 6455429 and 6455729
select * from heap/clust where group between 6655429 and 6655729
select * from heap/clust where group between 6955429 and 6955729
select * from heap/clust where group between 7195542 and 7155729


此基准测试的结果适用于heap:

rows  reads CPU   Elapsed 
----- ----- ----- --------
1503  1510  31ms  309ms
401   405   15ms  283ms
2700  2709  0ms   472ms
0     3     0ms   30ms
2953  2962  32ms  257ms
0     0     0ms   0ms

2011年3月9日更新: cmd.CommandText = "select * from heap where group between @id and @id+1000";

  • 721行的CPU> 0,并且影响超过0行

Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1001      69788    6368         -         
Cpu            15        374      37   0.00754
Reads        1069      91459    7682   1.20155
Writes          0          0       0   0.00000
Duration   0.3716   282.4850 10.3672   0.00180

更新结束于2011年3月9日.


clust的结果是:


for the table clust the results are:

rows  reads CPU   Elapsed 
----- ----- ----- --------
1503  4827  31ms  327ms
401   1241  0ms   242ms
2700  8372  0ms   410ms
0     3     0ms   0ms
2953  9060  47ms  213ms
0     0     0ms   0ms

2011年3月9日更新: cmd.CommandText = "select * from clust where group between @id and @id+1000";

  • 721行的CPU> 0,并且影响超过0行

Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1001      69788    6056         -
Cpu            15        468      38   0.00782
Reads        3194     227018   20457   3.37618
Writes          0          0       0       0.0
Duration   0.3949   159.6223 11.5699   0.00214

更新结束于2011年3月9日.


SELECT JOIN性能

cmd.CommandText = "select * from heap/clust h join keys k on h.group = k.group where h.group between @id and @id+1000";


SELECT WITH JOIN performance

cmd.CommandText = "select * from heap/clust h join keys k on h.group = k.group where h.group between @id and @id+1000";

此基准测试的结果适用于heap:

The results of this benchmark are for the heap:

873行的CPU> 0,并且影响超过0行

873 Rows have > 0 CPU and affect more than 0 rows

Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1009       4170    1683         -
Cpu            15         47      18   0.01175
Reads        2145       5518    2867   1.79246
Writes          0          0       0   0.00000
Duration   0.8215   131.9583  1.9095   0.00123


此基准测试的结果适用于clust:

865行具有> 0 CPU,并且影响超过0行

865 Rows have > 0 CPU and affect more than 0 rows

Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1000       4143    1685         -
Cpu            15         47      18   0.01193
Reads        5320      18690    8237   4.97813
Writes          0          0       0   0.00000
Duration   0.9699    20.3217  1.7934   0.00109

更新性能

第二批查询是更新语句:

UPDATE performance

The second batch of queries are update statements:

update heap/clust set amount = amount + 0 where group between 5678910 and 5679410
update heap/clust set amount = amount + 0 where group between 6234567 and 6234967
update heap/clust set amount = amount + 0 where group between 6455429 and 6455729
update heap/clust set amount = amount + 0 where group between 6655429 and 6655729
update heap/clust set amount = amount + 0 where group between 6955429 and 6955729
update heap/clust set amount = amount + 0 where group between 7195542 and 7155729


该基准测试的结果heap:

rows  reads CPU   Elapsed 
----- ----- ----- -------- 
1503  3013  31ms  175ms
401   806   0ms   22ms
2700  5409  47ms  100ms
0     3     0ms   0ms
2953  5915  31ms  88ms
0     0     0ms   0ms

2011年3月9日更新: cmd.CommandText = "update heap set amount = amount + @id where group between @id and @id+1000";

  • 811行的CPU> 0,并且影响超过0行

Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1001      69788    5598       811         
Cpu            15        873      56   0.01199
Reads        2080     167593   11809   2.11217
Writes          0       1687     121   0.02170
Duration   0.6705   514.5347 17.2041   0.00344

更新结束于2011年3月9日.


clust的基准测试结果:


the results of this benchmark for the clust:

rows  reads CPU   Elapsed 
----- ----- ----- -------- 
1503  9126  16ms  35ms
401   2444  0ms   4ms
2700  16385 31ms  54ms
0     3     0ms   0ms 
2953  17919 31ms  35ms
0     0     0ms   0ms

2011年3月9日更新: cmd.CommandText = "update clust set amount = amount + @id where group between @id and @id+1000";

  • 853行具有> 0 CPU,影响多于0行

Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1001      69788    5420         -
Cpu            15        594      50   0.01073
Reads        6226     432237   33597   6.20450
Writes          0       1730     110   0.01971
Duration   0.9134   193.7685  8.2919   0.00155

更新结束于2011年3月9日.


删除基准

我运行的第三批查询是删除语句


DELETE benchmarks

the third batch of queries I ran are delete statements

delete heap/clust where group between 5678910 and 5679410
delete heap/clust where group between 6234567 and 6234967
delete heap/clust where group between 6455429 and 6455729
delete heap/clust where group between 6655429 and 6655729
delete heap/clust where group between 6955429 and 6955729
delete heap/clust where group between 7195542 and 7155729


heap的基准测试结果:


The result of this benchmark for the heap:

rows  reads CPU   Elapsed 
----- ----- ----- -------- 
1503  10630 62ms  179ms
401   2838  0ms   26ms
2700  19077 47ms  87ms
0     4     0ms   0ms
2953  20865 62ms  196ms
0     4     0ms   9ms

2011年3月9日更新: cmd.CommandText = "delete heap where group between @id and @id+1000";

  • 724行具有> 0 CPU,影响多于0行

Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts     192      69788    4781         -
Cpu            15        499      45   0.01247
Reads         841     307958   20987   4.37880
Writes          2       1819     127   0.02648
Duration   0.3775  1534.3383 17.2412   0.00349

更新结束于2011年3月9日.


clust的基准测试结果:


the result of this benchmark for the clust:

rows  reads CPU   Elapsed 
----- ----- ----- -------- 
1503  9228  16ms  55ms
401   3681  0ms   50ms
2700  24644 46ms  79ms
0     3     0ms   0ms
2953  26955 47ms  92ms
0     3     0ms   0ms

2011年3月9日更新:

cmd.CommandText = "delete clust where group between @id and @id+1000";

  • 751行具有> 0 CPU,影响多于0行

Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts     144      69788    4648         -
Cpu            15        764      56   0.01538
Reads         989     458467   30207   6.48490
Writes          2       1830     127   0.02694
Duration   0.2938  2512.1968 24.3714   0.00555

更新结束于2011年3月9日.


INSERT基准测试

基准测试的最后一部分是执行插入语句.


INSERT benchmarks

The last part of the benchmark is the execution of insert statements.

插入堆/集群(...) 值(...), (...), (...), (...), (...), (...)

insert into heap/clust (...) values (...), (...), (...), (...), (...), (...)

heap的基准测试结果:

rows  reads CPU   Elapsed 
----- ----- ----- -------- 
6     38    0ms   31ms

2011年3月9日更新:

string str = @"insert into heap (group, currency, year, period, domain_id, mtdAmount, mtdAmount, ytdAmount, amount, ytd_restated, restated, auditDate, auditUser)
                    values";

                    for (int x = 0; x < 999; x++)
                    {
                        str += string.Format(@"(@id + {0}, 'EUR', 2012, 2, 0, 100, 100, 1000 + @id,1000, 1000,1000, current_timestamp, 'test'),  ", x);
                    }
                    str += string.Format(@"(@id, 'CAD', 2012, 2, 0, 100, 100, 1000 + @id,1000, 1000,1000, current_timestamp, 'test') ", 1000);

                    cmd.CommandText = str;

  • 912个语句具有> 0个CPU
  • 912 statements have > 0 CPU

Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1000       1000    1000         -
Cpu            15       2138      25   0.02500
Reads        5212       7069    6328   6.32837
Writes         16         34      22   0.02222
Duration   1.6336   293.2132  4.4009   0.00440

更新结束于2011年3月9日.


此基准测试对clust的结果:


The result of this benchmark for the clust:

rows  reads CPU   Elapsed 
----- ----- ----- -------- 
6     50    0ms   18ms

2011年3月9日更新:

string str = @"insert into clust (group, currency, year, period, domain_id, mtdAmount, mtdAmount, ytdAmount, amount, ytd_restated, restated, auditDate, auditUser)
                    values";

                    for (int x = 0; x < 999; x++)
                    {
                        str += string.Format(@"(@id + {0}, 'EUR', 2012, 2, 0, 100, 100, 1000 + @id,1000, 1000,1000, current_timestamp, 'test'),  ", x);
                    }
                    str += string.Format(@"(@id, 'CAD', 2012, 2, 0, 100, 100, 1000 + @id,1000, 1000,1000, current_timestamp, 'test') ", 1000);

                    cmd.CommandText = str;

  • 946个语句具有> 0个CPU
  • 946 statements have > 0 CPU

Counter   Minimum    Maximum Average  Weighted
--------- ------- ---------- ------- ---------
RowCounts    1000       1000    1000         -      
Cpu            15       2403      21   0.02157
Reads        6810       8997    8412   8.41223
Writes         16         25      19   0.01942
Duration   1.5375   268.2571  6.1463   0.00614

更新结束于2011年3月9日.


结论

尽管使用群集&访问表时,逻辑读操作仍在进行.非聚集索引(使用非聚集索引时)的性能结果为:


Conclusions

Although there are more logical reads going on when accessing the table with the clustered & the nonclustered index (while using the nonclustered index) the performance results are:

  • SELECT语句具有可比性
  • 在具有聚集索引的情况下,UPDATE语句更快
  • 有了聚集索引,DELETE语句会更快
  • 在具有聚集索引的情况下,INSERT语句更快

当然,我的基准测试在特定类型的表和非常有限的查询集上非常受限制,但是我认为基于这些信息,我们已经可以开始说在上面创建集群索引实际上总是更好.您的桌子.

Of course my benchmark was very limited on a specific kind of table and with a very limited set of queries, but I think that based on this information we can already start saying that it is virtually always better to create a clustered index on your table.

2011年3月9日更新:

从添加的结果中我们可以看到,有限测试的结论在每种情况下都不正确.

As we can see from the added results, the conclusions on the limited tests were not correct in every case.

现在,结果表明,从聚集索引中受益的唯一语句是更新语句.在具有聚集索引的表上,其他语句要慢大约30%.

The results now indicate that the only statements which benefit from the clustered index are the update statements. The other statements are about 30% slower on the table with clustered index.

一些其他图表,其中我绘制了每个查询对堆与集群的加权持续时间.

Some additional charts where I plotted the weighted duration per query for heap vs clust.

如您所见,insert语句的性能配置文件非常有趣.尖峰是由一些数据点导致的,这些数据点需要花费更长的时间才能完成.

As you can see the performance profile for the insert statements is quite interesting. The spikes are caused by a few data points which take a lot longer to complete.

更新结束于2011年3月9日.

这篇关于非聚集索引在堆上的性能与聚集索引的性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆