How to add an identity column to an existing database table that has a large number of rows

Question

I have a database table with about 40,000,000 rows. I want to add an identity column to this table. How can I do it in a log-friendly manner?

When I run the following:

ALTER TABLE table_1
  ADD id INT IDENTITY

this just fills up the entire log space.

Is there any way to do it in a log-friendly manner? The database is on SQL Server 2008.

Thanks, Mohan.

Recommended answer

The overall process will probably be a lot slower, with more locking overhead overall, but if you only care about transaction log size you could try the following.

  1. Add a nullable, non-identity integer column (a metadata-only change).
  2. Write code to populate it with unique sequential integers in batches. This reduces the size of each individual transaction and keeps the log size down (assuming the simple recovery model). My code below does this in batches of 100. Hopefully you have an existing PK you can leverage to pick up where you left off, rather than rescanning the table for each batch, which takes increasingly long towards the end.
  3. Use ALTER TABLE ... ALTER COLUMN to mark the column as NOT NULL. This requires the entire table to be locked and scanned to validate the change, but does not require much logging.
  4. Use ALTER TABLE ... SWITCH to make the column an identity column. This is a metadata-only change.
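Step 2 only keeps the log small if the log can be truncated between batches, so it may be worth confirming the recovery model and watching log usage before and during the run. A minimal sketch, assuming it is run in the database that holds the table:

```sql
-- Recovery model of the current database (SIMPLE, FULL or BULK_LOGGED)
-- and the reason, if any, that log space cannot currently be reused.
SELECT name,
       recovery_model_desc,
       log_reuse_wait_desc
FROM   sys.databases
WHERE  name = DB_NAME();

-- Log size and percentage of the log in use for every database on the instance.
DBCC SQLPERF(LOGSPACE);
```

Under the full or bulk-logged recovery model the log is only truncated by log backups, so batching alone would not be expected to stop the log from growing there.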

Example code below:

/*Set up test table with just one column*/

CREATE TABLE table_1 ( original_column INT )
INSERT  INTO table_1
        SELECT DISTINCT
                number
        FROM    master..spt_values



/*Step 1 */
ALTER TABLE table_1 ADD id INT NULL



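/*Step 2: assign sequential ids in batches of 100.
  The loop ends when a batch updates no rows, i.e. no NULL ids remain.
  ORDER BY @@SPID is a constant, so row order within a batch is arbitrary. */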
DECLARE @Counter INT = 0 ,
    @PrevCounter INT = -1

WHILE @PrevCounter <> @Counter 
    BEGIN
        SET @PrevCounter = @Counter;
        WITH    T AS ( SELECT TOP 100
                                * ,
                                ROW_NUMBER() OVER ( ORDER BY @@SPID )
                                + @Counter AS new_id
                       FROM     table_1
                       WHERE    id IS NULL
                     )
            UPDATE  T
            SET     id = new_id
        SET @Counter = @Counter + @@ROWCOUNT
    END


BEGIN TRY;
    BEGIN TRANSACTION ;
     /*Step 3 */
    ALTER TABLE table_1 ALTER COLUMN id INT NOT NULL

    /*Step 4: the IDENTITY seed must be a literal, so the CREATE TABLE is built as dynamic SQL */
    DECLARE @TableScript NVARCHAR(MAX) = '
    CREATE TABLE dbo.Destination(
        original_column INT,
        id INT IDENTITY(' + CAST(@Counter + 1 AS VARCHAR) + ',1)
        )

        ALTER TABLE dbo.table_1 SWITCH TO dbo.Destination;
    '       

    EXEC(@TableScript)


    DROP TABLE table_1 ;

    EXECUTE sp_rename N'dbo.Destination', N'table_1', 'OBJECT' ;


    COMMIT TRANSACTION ;
END TRY
BEGIN CATCH
    IF XACT_STATE() <> 0 
        ROLLBACK TRANSACTION ;
    PRINT ERROR_MESSAGE() ;
END CATCH ;
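The Step 2 loop above looks for rows WHERE id IS NULL on every pass, which is the repeated scan the answer warns about. If the table has an existing key, the loop can walk that key instead. The following is only a sketch under assumptions that are not in the original: a unique, indexed, NOT NULL INT column with the hypothetical name pk_col, a smallest key value above the INT minimum, no concurrent inserts while it runs, and an illustrative batch size of 10000.

```sql
/* Assumes Step 1 has already added the nullable id column.
   pk_col is a hypothetical unique, indexed, NOT NULL INT key. */
DECLARE @BatchSize INT = 10000 ,  -- illustrative value; tune to your log size
    @Counter INT = 0 ,            -- highest id assigned so far
    @LastKey INT ,                -- highest pk_col value already numbered
    @NextKey INT                  -- upper key bound of the current batch

/* Start just below the smallest key (stays NULL if the table is empty) */
SELECT  @LastKey = MIN(pk_col) - 1
FROM    dbo.table_1

WHILE 1 = 1
    BEGIN
        /* Find the key that ends the next batch with a range seek on pk_col */
        SELECT  @NextKey = MAX(pk_col)
        FROM    ( SELECT TOP ( @BatchSize )
                            pk_col
                  FROM      dbo.table_1
                  WHERE     pk_col > @LastKey
                  ORDER BY  pk_col
                ) AS batch

        /* MAX over an empty set is NULL: no rows left to number */
        IF @NextKey IS NULL
            BREAK;

        /* Number only the rows in the key range (@LastKey, @NextKey] */
        WITH    T AS ( SELECT   id ,
                                ROW_NUMBER() OVER ( ORDER BY pk_col )
                                + @Counter AS new_id
                       FROM     dbo.table_1
                       WHERE    pk_col > @LastKey
                                AND pk_col <= @NextKey
                     )
            UPDATE  T
            SET     id = new_id
        SET @Counter = @Counter + @@ROWCOUNT

        SET @LastKey = @NextKey
    END
```

Used in place of Step 2 and run in the same batch as Steps 3 and 4, this leaves @Counter holding the highest id assigned, which Step 4 needs for the identity seed. Checking that no rows still have id IS NULL before Step 3 is a cheap safeguard in case rows were inserted during the run.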
