Fastest options for merging two tables in SQL Server
Question
Consider two very large tables: Table A with 20 million rows, and Table B with 10 million rows, which overlaps heavily with Table A. Both have an identifier column and a number of other data columns. I need to move all rows from Table B into Table A, updating the rows that already exist.
Both tables have the same structure:
- Identifier int
- Date DateTime
- Identifier A
- Identifier B
- General decimal data (maybe 10 columns)
I can very quickly get the items in Table B that are new, and the items in Table B that need to be updated in Table A, but I can't get an update or a delete-and-insert to run quickly. What options are available to merge the contents of TableB into TableA (i.e. updating existing records instead of inserting them) in the shortest time?
I've tried pulling out the existing records from TableB and running a large update on TableA to update just those rows (i.e. one update statement per row), and performance is pretty bad, even with a good index on the table.
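For comparison with the one-statement-per-row approach described above, a set-based version might look like the following sketch. The column names (Identifier, [Date], IdentifierA, IdentifierB, Value1) are assumptions based on the structure listed in the question, and Value1 stands in for the roughly ten decimal columns. This is still an in-place, fully logged operation, so it is not guaranteed to be fast enough on tables of this size.

```sql
-- Column names are assumed from the question's description; Value1 is hypothetical.
BEGIN TRANSACTION;

-- One UPDATE for all overlapping rows, instead of one statement per row.
UPDATE a
SET    a.[Date]      = b.[Date],
       a.IdentifierA = b.IdentifierA,
       a.IdentifierB = b.IdentifierB,
       a.Value1      = b.Value1
FROM   dbo.TableA AS a
JOIN   dbo.TableB AS b
       ON b.Identifier = a.Identifier;

-- One INSERT for the rows that exist only in TableB.
INSERT INTO dbo.TableA (Identifier, [Date], IdentifierA, IdentifierB, Value1)
SELECT b.Identifier, b.[Date], b.IdentifierA, b.IdentifierB, b.Value1
FROM   dbo.TableB AS b
WHERE  NOT EXISTS (SELECT 1
                   FROM   dbo.TableA AS a
                   WHERE  a.Identifier = b.Identifier);

COMMIT TRANSACTION;
```

The UPDATE runs before the INSERT so that freshly inserted rows are not matched and rewritten a second time.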
I've also tried a one-shot delete from TableA of the rows that also exist in TableB, and the delete performs poorly too, even with the indexes dropped.
I appreciate that this may be difficult to perform quickly, but I'm looking for other options that are available to achieve this.
Recommended answer
Since you are dealing with two large tables, in-place updates/inserts/merges can be time-consuming operations. I would recommend using a bulk-logged (minimally logged) technique to load the desired content into a new table and then perform a table swap:
Using SELECT INTO:
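-- Build the merged table in a single pass.
-- Rows from TableB take precedence, so rows that already exist end up "updated".
-- id stands for the identifier column shared by both tables.
SELECT *
INTO dbo.NewTableA
FROM (
    SELECT * FROM dbo.TableB b
    UNION ALL
    SELECT * FROM dbo.TableA a
    WHERE NOT EXISTS (SELECT * FROM dbo.TableB b WHERE b.id = a.id)
) d;

-- Swap the tables, keeping the original as a backup.
EXEC sp_rename 'dbo.TableA', 'BackupTableA';
EXEC sp_rename 'dbo.NewTableA', 'TableA';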
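Before dropping the backup, it may be worth checking the row counts after the swap. The sketch below assumes that id is unique within each table and that the renames above have already run, so dbo.TableA is the merged table and dbo.BackupTableA is the original; the column aliases are illustrative.

```sql
-- Sanity check after the swap (assumes id is unique in each table).
SELECT
    (SELECT COUNT(*) FROM dbo.TableA)       AS MergedRows,
    (SELECT COUNT(*) FROM dbo.BackupTableA) AS OriginalRows,
    (SELECT COUNT(*)
     FROM   dbo.TableB b
     WHERE  NOT EXISTS (SELECT 1
                        FROM   dbo.BackupTableA a
                        WHERE  a.id = b.id)) AS NewRowsFromB;

-- Expect MergedRows = OriginalRows + NewRowsFromB.
-- Only once that holds and the dependent objects are recreated, consider:
-- DROP TABLE dbo.BackupTableA;
```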
The Simple or at least Bulk-Logged recovery model is highly recommended for this approach. Also, I assume it has to be done outside business hours, since plenty of missing objects have to be recreated on the new table: indexes, default constraints, the primary key, etc.
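As a rough sketch of the surrounding steps, the recovery model can be switched around the load and the missing objects recreated afterwards. MyDatabase, PK_TableA_Merged and IX_TableA_Date are hypothetical names, the original recovery model is assumed to be FULL, and the real list of indexes and constraints has to come from the original table's definition.

```sql
-- MyDatabase, PK_TableA_Merged and IX_TableA_Date are hypothetical names.
-- Switch to minimal logging for the bulk load.
ALTER DATABASE MyDatabase SET RECOVERY BULK_LOGGED;

-- ... run the SELECT INTO and the two sp_rename calls here ...

-- Restore the original recovery model (FULL is assumed here).
ALTER DATABASE MyDatabase SET RECOVERY FULL;

-- SELECT INTO does not copy indexes or constraints, so they must be rebuilt.
-- The old constraint names still belong to BackupTableA, so new names are needed.
-- The key column must be NOT NULL before a primary key can be added.
ALTER TABLE dbo.TableA
    ADD CONSTRAINT PK_TableA_Merged PRIMARY KEY CLUSTERED (id);

CREATE NONCLUSTERED INDEX IX_TableA_Date
    ON dbo.TableA ([Date]);
```

After switching back to the full recovery model, taking a log backup is generally advisable so that point-in-time recovery is available again.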