更新 1.2 亿条记录的最快方法 [英] Fastest way to update 120 Million records

查看：29 发布时间：2021/8/25 18:49:29 sql sql-server sql-server-2005

本文介绍了更新 1.2 亿条记录的最快方法的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我需要在 1.2 亿条记录表中初始化一个值为 -1 的新字段.

I need to initialize a new field with the value -1 in a 120 Million record table.

Update table
       set int_field = -1;

在取消之前我让它运行了 5 个小时.

I let it run for 5 hours before canceling it.

我尝试在事务级别设置为读取未提交的情况下运行它，但结果相同.

I tried running it with transaction level set to read uncommitted with the same results.

Recovery Model = Simple.
MS SQL Server 2005

有什么建议可以更快地完成这项工作吗?

Any advice on getting this done faster?

推荐答案

更新 120M 记录表的唯一合理方法是使用 SELECT 语句填充秒桌子.执行此操作时必须小心.下面的说明.

The only sane way to update a table of 120M records is with a SELECT statement that populates a second table. You have to take care when doing this. Instructions below.

简单案例

对于没有聚集索引的表，在没有并发 DML 的时间内:

For a table w/out a clustered index, during a time w/out concurrent DML:

SELECT *, new_col = 1 INTO clone.BaseTable FROM dbo.BaseTable
在新表上重新创建索引、约束等
使用 ALTER SCHEMA ... TRANSFER 切换旧的和新的.
删除旧表

如果您无法创建克隆架构，则可以使用同一架构中的不同表名.请记住在切换后重命名所有约束和触发器(如果适用).

If you can't create a clone schema, a different table name in the same schema will do. Remember to rename all your constraints and triggers (if applicable) after the switch.

非简单案例

首先，在不同的架构下使用相同的名称重新创建您的BaseTable，例如clone.BaseTable.使用单独的架构将简化稍后的重命名过程.

First, recreate your BaseTable with the same name under a different schema, eg clone.BaseTable. Using a separate schema will simplify the rename process later.

包括聚集索引(如果适用).请记住，主键和唯一约束可能会聚集在一起，但不一定如此.
包括标识列和计算列(如果适用).
包括您的新 INT 列，无论它属于哪里.
请勿包含以下任何内容:
- 触发器
- 外键约束
- 非聚集索引/主键/唯一约束
- 检查约束或默认约束.默认值没有太大区别，但我们正在努力保持事情最少.
- Include the clustered index, if applicable. Remember that primary keys and unique constraints may be clustered, but not necessarily so.
- Include identity columns and computed columns, if applicable.
- Include your new INT column, wherever it belongs.
- Do not include any of the following:
  - triggers
  - foreign key constraints
  - non-clustered indexes/primary keys/unique constraints
  - check constraints or default constraints. Defaults don't make much of difference, but we're trying to keep things minimal.
  然后，用 1000 行测试您的插入:
  
  Then, test your insert w/ 1000 rows:
```
-- assuming an IDENTITY column in BaseTable
SET IDENTITY_INSERT clone.BaseTable ON
GO
INSERT clone.BaseTable WITH (TABLOCK) (Col1, Col2, Col3)
SELECT TOP 1000 Col1, Col2, Col3 = -1
FROM dbo.BaseTable
GO
SET IDENTITY_INSERT clone.BaseTable OFF
```
  检查结果.如果一切都按顺序出现:
  
  Examine the results. If everything appears in order:
  - 截断克隆表
  - 确保数据库处于大容量日志或简单恢复模式
  - 执行完整插入.
  这将需要一段时间，但不会像更新一样长.完成后，检查克隆表中的数据以确保一切正确.
  
  This will take a while, but not nearly as long as an update. Once it completes, check the data in the clone table to make sure it everything is correct.
  
  然后，重新创建所有非聚集主键/唯一约束/索引和外键约束(按此顺序).如果适用，重新创建默认值并检查约束.重新创建所有触发器.在单独的批处理中重新创建每个约束、索引或触发器.例如:
  
  Then, recreate all non-clustered primary keys/unique constraints/indexes and foreign key constraints (in that order). Recreate default and check constraints, if applicable. Recreate all triggers. Recreate each constraint, index or trigger in a separate batch. eg:
```
ALTER TABLE clone.BaseTable ADD CONSTRAINT UQ_BaseTable UNIQUE (Col2)
GO
-- next constraint/index/trigger definition here
```
  最后，将 dbo.BaseTable 移动到备份架构，将 clone.BaseTable 移动到 dbo 架构(或您的表应该存在的任何地方).
  
  Finally, move dbo.BaseTable to a backup schema and clone.BaseTable to the dbo schema (or wherever your table is supposed to live).
```
-- -- perform first true-up operation here, if necessary
-- EXEC clone.BaseTable_TrueUp
-- GO
-- -- create a backup schema, if necessary
-- CREATE SCHEMA backup_20100914
-- GO
BEGIN TRY
  BEGIN TRANSACTION
  ALTER SCHEMA backup_20100914 TRANSFER dbo.BaseTable
  -- -- perform second true-up operation here, if necessary
  -- EXEC clone.BaseTable_TrueUp
  ALTER SCHEMA dbo TRANSFER clone.BaseTable
  COMMIT TRANSACTION
END TRY
BEGIN CATCH
  SELECT ERROR_MESSAGE() -- add more info here if necessary
  ROLLBACK TRANSACTION
END CATCH
GO
```
  如果您需要释放磁盘空间，此时您可以删除原始表，但最好将其保留一段时间.
  
  If you need to free-up disk space, you may drop your original table at this time, though it may be prudent to keep it around a while longer.
  
  不用说，这是理想的离线操作.如果有人在执行此操作时修改数据，则必须使用架构开关执行校准操作.我建议在 dbo.BaseTable 上创建一个触发器来将所有 DML 记录到一个单独的表中.在开始插入之前启用此触发器.然后在您执行模式传输的同一事务中，使用日志表执行校准.首先在数据的子集上测试！Delta 很容易搞砸.
  
  Needless to say, this is ideally an offline operation. If you have people modifying data while you perform this operation, you will have to perform a true-up operation with the schema switch. I recommend creating a trigger on dbo.BaseTable to log all DML to a separate table. Enable this trigger before you start the insert. Then in the same transaction that you perform the schema transfer, use the log table to perform a true-up. Test this first on a subset of the data! Deltas are easy to screw up.
  
  这篇关于更新 1.2 亿条记录的最快方法的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

更新 1.2 亿条记录的最快方法 [英] Fastest way to update 120 Million records

问题描述

推荐答案

相关文章

数据库最新文章

热门教程

热门工具

登录关闭

更新 1.2 亿条记录的最快方法 [英] Fastest way to update 120 Million records

问题描述

推荐答案

相关文章

数据库最新文章

热门教程

热门工具

登录 关闭

登录关闭