Change column types in a huge table
Question
I have a table in SQL Server 2008 R2 with close to a billion rows. I want to change the datatype of two columns from int to bigint. Running ALTER TABLE zzz ALTER COLUMN yyy twice works, but it's very slow. How can I speed the process up? I was thinking of copying the data to another table, dropping and recreating the original, and copying the data back, along with switching to the simple recovery model, or somehow doing it with a cursor 1000 rows at a time, but I'm not sure whether either would actually lead to any improvement.
Recommended answer
Depending on what change you are making, sometimes it can be easier to take a maintenance window. During that window (where nobody should be able to change the data in the table) you can:
- drop any indexes/constraints pointing to the old column, and disable triggers
- add a new nullable column with the new data type (even if it is meant to be NOT NULL)
- update the new column, setting it equal to the old column's value (you can do this in chunks, each in its own transaction, say 10000 rows at a time using UPDATE TOP (10000) ... SET newcol = oldcol WHERE newcol IS NULL, and with CHECKPOINT to avoid overrunning your log)
- once the updates are all done, drop the old column
- rename the new column (and add a NOT NULL constraint if appropriate)
- rebuild indexes and update statistics
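The steps above could be sketched in T-SQL roughly as follows. This is a minimal sketch, not a tested migration script: dbo.zzz and yyy come from the question, while the yyy_new column name, the batch size variable, and the extra `yyy IS NOT NULL` filter are assumptions added for illustration. Step 1 (dropping indexes and constraints, disabling triggers) depends on the schema and is left out.

```sql
-- Hypothetical names: dbo.zzz (table), yyy (old int column), yyy_new (new bigint column).

-- Step 2: add the new column as nullable, even if it will end up NOT NULL.
ALTER TABLE dbo.zzz ADD yyy_new bigint NULL;
GO

-- Step 3: copy the values in batches. Each UPDATE runs as its own
-- autocommit transaction, so no single transaction covers the whole table.
DECLARE @batch int = 10000;
DECLARE @rows  int = 1;

WHILE @rows > 0
BEGIN
    UPDATE TOP (@batch) dbo.zzz
    SET    yyy_new = yyy
    WHERE  yyy_new IS NULL
      AND  yyy IS NOT NULL;  -- assumption: skip NULL source values so the loop can terminate

    SET @rows = @@ROWCOUNT;

    -- Under the SIMPLE recovery model this allows log space to be reused.
    -- Under FULL recovery, log backups are needed instead.
    CHECKPOINT;
END
GO

-- Step 4: drop the old column once every row has been copied.
ALTER TABLE dbo.zzz DROP COLUMN yyy;
GO

-- Step 5: rename the new column, then tighten nullability if appropriate.
EXEC sp_rename 'dbo.zzz.yyy_new', 'yyy', 'COLUMN';
ALTER TABLE dbo.zzz ALTER COLUMN yyy bigint NOT NULL;
GO

-- Step 6: recreate whatever was dropped in step 1 (schema-specific, not shown),
-- then rebuild the remaining indexes and refresh statistics.
ALTER INDEX ALL ON dbo.zzz REBUILD;
UPDATE STATISTICS dbo.zzz;
GO
```

Without a supporting index, each batch has to search for rows where yyy_new is still NULL, so batches may slow down as the copy progresses. Walking the table in ranges of the clustered key is a common alternative, but it depends on the table's key, which the question does not describe.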
The key here is that it allows you to perform the update incrementally in step 3, which you can't do in a single ALTER TABLE command.
This assumes the column is not playing a major role in data integrity - if it is involved in a bunch of foreign key relationships, there are more steps.
Edit
Also, just thinking out loud (I haven't done any testing for this, but I'm adding it to my list): I wonder whether page or row compression would help here. If you change an INT to a BIGINT with compression in place, SQL Server should still treat all values as if they fit in an INT. Again, I haven't tested whether this would make the alter faster or slower, or how long it would take to add compression in the first place. Just throwing it out there.
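For anyone who wants to try the compression idea, a sketch along these lines might be a starting point. It is untested in this context, like the idea itself. It assumes the table is dbo.zzz, and data compression in SQL Server 2008 R2 is limited to the Enterprise and Developer editions.

```sql
-- Estimate how much space ROW compression would save before committing to a rebuild.
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'zzz',   -- table name taken from the question
     @index_id         = NULL,    -- all indexes
     @partition_number = NULL,    -- all partitions
     @data_compression = 'ROW';

-- Enable ROW compression on the heap or clustered index.
-- This rebuilds the whole table, so it is itself an expensive operation.
ALTER TABLE dbo.zzz REBUILD WITH (DATA_COMPRESSION = ROW);
```

Whether the later ALTER COLUMN runs faster on the compressed table is exactly the open question, so it would be worth timing on a copy of the data first.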