TSql, building indexes before or after data input


Question

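A performance question about indexing large amounts of data. I have a large table (~30 million rows), with 4 of the columns indexed to allow for fast searching. Currently I set the indexes up, then import my data. This takes roughly 4 hours, depending on the speed of the db server. Would it be quicker/more efficient to import the data first, and then build the indexes?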

Answer

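I'd temper af's answer by saying that "index first, insert after" would probably be slower than "insert first, index after" when you are inserting records into a table with a clustered index, but not inserting them in the natural order of that index. The reason is that for each insert, the data rows themselves have to be kept in key order on disk.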
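A minimal T-SQL sketch of the "insert first, index after" ordering follows. The table, column, index, and file names are all hypothetical (the question gives none), and the bulk-load options are assumptions about the input file rather than a recommendation.

```sql
-- Hypothetical schema: four searchable columns plus a GUID key.
CREATE TABLE dbo.ImportTarget
(
    Id   UNIQUEIDENTIFIER NOT NULL,
    ColA INT              NOT NULL,
    ColB INT              NOT NULL,
    ColC VARCHAR(50)      NOT NULL,
    ColD DATETIME2(0)     NOT NULL
);  -- no indexes yet, so the table is a heap

-- Step 1: load the data while no index has to be maintained.
-- The path and terminators are assumptions about the import file.
BULK INSERT dbo.ImportTarget
FROM 'C:\import\data.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK);

-- Step 2 (optional): build the clustered index before the others.
-- Creating it later would force every nonclustered index to be rebuilt.
CREATE UNIQUE CLUSTERED INDEX CIX_ImportTarget_Id
    ON dbo.ImportTarget (Id);

-- Step 3: build the four search indexes, each in a single sorted pass.
CREATE NONCLUSTERED INDEX IX_ImportTarget_ColA ON dbo.ImportTarget (ColA);
CREATE NONCLUSTERED INDEX IX_ImportTarget_ColB ON dbo.ImportTarget (ColB);
CREATE NONCLUSTERED INDEX IX_ImportTarget_ColC ON dbo.ImportTarget (ColC);
CREATE NONCLUSTERED INDEX IX_ImportTarget_ColD ON dbo.ImportTarget (ColD);
```

Whether this beats the index-first approach on a given server still has to be measured; the sketch only shows the order of operations.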

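As an example, consider a table with a clustered primary key on a uniqueidentifier column. Because a GUID is (nearly) random, one row may be added at the top of the data, causing all the data in that page to be shuffled along (and maybe data in lower pages too), while the next row is added at the bottom. If the clustering were on, say, a datetime column, and you happened to be adding rows in date order, then the records would naturally be inserted in the correct order on disk, and the expensive data sorting/shuffling operations would not be needed.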
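One way to observe the effect described above is to load two small test tables, one clustered on a random GUID and one clustered on an ascending datetime, and then compare their fragmentation. This is only a sketch: the table names, payload size, and row count are made up for illustration, and the numbers it reports will vary by server.

```sql
-- Hypothetical test tables, for illustration only.
CREATE TABLE dbo.ByGuid
(
    Id      UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID(),  -- random key
    Payload CHAR(200)        NOT NULL,
    CONSTRAINT PK_ByGuid PRIMARY KEY CLUSTERED (Id)
);

CREATE TABLE dbo.ByDate
(
    CreatedAt DATETIME2(0) NOT NULL,                    -- ascending key
    Payload   CHAR(200)    NOT NULL
);
CREATE CLUSTERED INDEX CIX_ByDate ON dbo.ByDate (CreatedAt);

-- Insert the same number of rows into both tables, one row at a time.
SET NOCOUNT ON;
DECLARE @i INT = 0;
WHILE @i < 20000
BEGIN
    INSERT INTO dbo.ByGuid (Payload) VALUES ('x');
    INSERT INTO dbo.ByDate (CreatedAt, Payload)
    VALUES (DATEADD(SECOND, @i, '2020-01-01'), 'x');
    SET @i += 1;
END;

-- Compare the clustered indexes (index_id = 1) of the two tables.
SELECT OBJECT_NAME(ips.object_id)       AS table_name,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
WHERE ips.object_id IN (OBJECT_ID('dbo.ByGuid'), OBJECT_ID('dbo.ByDate'))
  AND ips.index_id = 1;
```

If the reasoning above holds, the GUID-keyed table should typically show noticeably higher fragmentation and more pages than the date-keyed one, since random keys cause page splits in the middle of the index.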

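I'd back up Winston Smith's answer of "it depends", but I suggest that your clustered index may be a significant factor in determining which strategy is faster in your current circumstances. You could even try having no clustered index at all and see what happens. Let me know?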
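Since the answer is "it depends", timing each variant is the practical way to decide. The sketch below times one variant on a table with no clustered index: the nonclustered index is disabled before the load and rebuilt afterwards. The table, the index name, and the generated test rows are all hypothetical stand-ins for the real import.

```sql
-- Hypothetical heap (no clustered index) with one nonclustered index.
CREATE TABLE dbo.HeapTrial
(
    ColA INT NOT NULL,
    ColB INT NOT NULL
);
CREATE NONCLUSTERED INDEX IX_HeapTrial_ColA ON dbo.HeapTrial (ColA);

DECLARE @start DATETIME2(3) = SYSDATETIME();

-- Stop maintaining the index during the load.
-- (Only do this to nonclustered indexes: disabling a clustered
-- index makes the table itself inaccessible.)
ALTER INDEX IX_HeapTrial_ColA ON dbo.HeapTrial DISABLE;

-- Stand-in for the real import: 100,000 rows with random ColA values.
INSERT INTO dbo.HeapTrial (ColA, ColB)
SELECT TOP (100000)
       CHECKSUM(NEWID()),
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- Rebuild the index in one pass now that the data is in place.
ALTER INDEX IX_HeapTrial_ColA ON dbo.HeapTrial REBUILD;

SELECT DATEDIFF(MILLISECOND, @start, SYSDATETIME()) AS elapsed_ms;
```

Running the same load with the index left enabled, and again with a clustered index added, gives comparable `elapsed_ms` figures for each strategy.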

