大量索引子项的批量插入(Sql Server 2008) [英] Bulk inserts of heavily indexed child items (Sql Server 2008)

查看:200
本文介绍了大量索引子项的批量插入(Sql Server 2008)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试为数据库创建数据导入机制,该数据库要求读者具有高可用性,同时在调度时提供不规则的大量新数据。

I'm trying to create a data import mechanism for a database that requires high availability to readers while serving irregular bulk loads of new data as they are scheduled.

新数据仅涉及三个表,其中添加了新数据集以及由它们引用的许多新数据集项以及引用它们的一些数据集项元数据行。数据集可能包含数万个数据集项。

The new data involves just three tables with new datasets being added along with many new dataset items being referenced by them and a few dataset item metadata rows referencing those. Datasets may have tens of thousands of dataset items.

数据集项目在几个列组合上被大量索引,绝大多数(但不是全部)读取包括数据集ID在where子句中。由于索引,数据插入现在太慢,无法跟上流入,但因为这些索引的读者优先,我无法删除主表上的索引,但需要处理副本。

The dataset items are heavily indexed on several combinations of columns with the vast majority (but not all) reads including the dataset id in the where clause. Because of the indexes, data inserts are now too slow to keep up with inflows but because readers of those indexes take priority I can not remove the indexes on the main table but need to work on a copy.

因此,我需要一些工作表,我将其复制,插入并重新索引,然后快速切换它成为查询表/视图的一部分。问题是我如何快速执行该切换?

I therefore need some kind of working table that I copy into, insert into and reindex before quickly switching it to become part of the queried table/view. The question is how do I quickly perform that switch?

我已经研究过用一系列数据集id来划分数据集项表,这是一个外键,但是因为这不是主键的一部分SQL Server似乎没那么容易。我无法使用易于索引的更新版本切换旧数据分区。

I have looked into partitioning the dataset items table by a range of dataset id, which is a foreign key, but because this isn't part of the primary key SQL Server doesn't seem make that easy. I am not able to switch the old data partition with a readily indexed updated version.

不同的文章建议使用分区,快照隔离和分区视图,但没有一个直接回答这种情况,无论是批量加载和归档旧数据(按日​​期分区)还是简单的事务隔离而不考虑索引。

Different articles suggest use of partitioning, snapshot isolation and partitioned views but none directly answer this situation, being either about bulk loading and archiving of old data (partitioned by date) or simple transaction isolation without considering indexing.

是否有任何示例直接解决这个看似常见的问题?

Is there any examples that directly tackle this seemingly common problem?

人们有什么不同的策略可以真正减少在将新数据批量加载到大型索引表时禁用索引的时间?

推荐答案

请注意,对列进行分区需要将列作为聚簇索引键的一部分,而不是主要的一部分键。这两者是独立的。

Notice, that partitioning on a column requires the column to be part of the clustered index key, not part of the primary key. The two are independent.

尽管如此,分区对你可以在桌面上执行的操作施加了很多限制。例如,切换仅在所有索引都已对齐且没有外键引用要修改的表时才有效。

Still, partitioning imposes lots of constraints on what you operations you can perform on your table. For example, switching only works if all indexes are aligned and no foreign keys reference the table being modified.

如果您可以在所有这些限制下使用分区,则这是可能是最好的方法。分区视图为您提供了更大的灵活性,但具有类似的限制:所有索引显然是对齐的,并且传入的FK是不可能的。

If you can make use of partitioning under all of those restrictions this is probably the best approach. Partitioned views give you a more flexibility but have similar restrictions: All indexes are obviously aligned and incoming FKs are impossible.

分区数据并不容易。它不是一个点击式向导即可完成的解决方案。权衡取舍非常复杂。

Partitioning data is not easy. It is not a click-through-wizard-and-be-done solution. The set of tradeoffs is very complex.

这篇关于大量索引子项的批量插入(Sql Server 2008)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆