Sql Server - 从 VLT 中删除列(超大表) [英] Sql Server - Drop column from VLT (Very Large Table )
问题描述
任何人都可以请教什么是最好的实现以下目标:
Can anyone please advise what is the best to achieve the below:
要求:从 VLT 中删除 5 列(近 400 GB).
Requirement: Drop 5 columns from VLT (nearly 400 gb) in size.
当我们尝试做同样的事情时,我们在 PRODUCTION 上面临空间问题、超时错误(通过 SSMS)
The moment we try do the same, we face space issues on PRODUCTION, timeout errors (thru SSMS)
我们尝试在任何临时表中插入(通过保持身份关闭),但随后我们插入了所有近数十亿行数据并尝试打开身份,我们面临超时错误.
We tried to insert in any temp table (by keeping identity off), but then we inserted all nearly billions rows of data and we tried to switch the identity on, we are facing timeout errors.
我们是否应该通过 POWERSHELL 进行这些操作,与在 SSMS 中进行相比会更好
should we do these operations thru POWERSHELL would be better as compared to doing in SSMS
局限:生产空间有限,tempdb因为这些操作增长很快.
Limitation: Limited space on production, tempdb growing fast because of these operations.
请告知,从 VLT 中删除列的最佳方法是什么.
Please advise, what could be the best approach to drop column from VLT.
问候
推荐答案
我会采用已经提到的方法之一,但有一些关键的修改.假设您使用的是 SQL Server 2008,请执行以下操作:
I would take one of the approaches already mentioned but with some key modifications. Assuming you are on SQL Server 2008, do the following:
制作现有的非常大的表的零长度副本,其中仅包含您要保留的列:
Make a zero-length copy of your existing very large table with only the columns you want to keep:
select top 0 {{column subset}} into tbl_tableB from tableA
确保还将所有索引、约束等复制到新表中.标识列将由 SELECT...INTO
语句适当处理.
Be sure to also copy any indexes, constraints, etc. to the new table. Identity columns will be handled appropriately by the SELECT...INTO
statement.
重命名原表;我们将在下一步中用视图替换它.
Rename the original table; we will replace it with a view in the next step.
exec sys.sp_rename @objname = 'tableA', @newname = 'tbl_tableA'
使用原始表名和UNION ALL
创建一个视图:
create view tableA
as
select {{column subset}} from tbl_tableA
union all
select {{column subset}} from tbl_tableB
这将与查询数据的应用程序保持一定程度的兼容性.INSERTs
、UPDATEs
和 DELETEs
必须通过视图上的触发器来处理.UNION ALL
将防止 tempdb 中的压力,因为将没有排序(相对于直接的 UNION
),并且我们永远不会有超过一个行的副本存在于一段时间.
This will maintain some level of compatibility with applications querying the data. INSERTs
, UPDATEs
, and DELETEs
will have to be handled via triggers on the view. The UNION ALL
will prevent pressure in tempdb since there will be no sorting (versus a straight UNION
), and we will never have more than one copy of a row in existence at a time.
使用DELETE
结合OUTPUT
子句从原表批量删除数据,同时插入新表:
Use a DELETE
combined with an OUTPUT
clause to delete data in batches from the original table and simultaneously insert it into the new table:
BEGIN TRAN
DELETE TOP (1000) /* or whatever batch size you want */
FROM
tbl_tableA
OUTPUT (
DELETED.{{column subset}} /* have to list each column here prefixed by DELETED. */
)
INTO
tbl_tableB (
{{column subset}} /* again list each column here */
)
/* Check for errors */
/* COMMIT or ROLLBACK */
/* rinse and repeat [n] times */
完成DELETEs
/INSERTs
后,删除视图,删除原始表,重命名新表:
Once you're done with the DELETEs
/INSERTs
, drop the view, drop the original table, rename the new table:
drop view tableA
drop table tbl_tableA
exec sys.sp_rename @objname = 'tbl_tableB', @newname = 'tableA'
这种方法的主要优点是 DELETE
和 INSERT
在同一个事务中同时发生,这意味着数据将始终处于一致状态.您可以通过更改 TOP
子句来增加批处理的大小,从而更好地控制事务日志的使用和阻塞.我已经在带有和不带有标识列的表上测试了这种确切的方法,并且效果很好.在一张非常大的桌子上,运行需要一段时间;可能需要几个小时到几天,但它会以预期的结果完成.
The overriding merit of this approach is that the DELETE
and INSERT
happen simultaneously in the same transaction, meaning the data will always be in a consistent state. You can increase the size of the batch by changing the TOP
clause, giving you more control over transaction log usage and blocking. I've tested this exact approach on tables with and without identity columns and it works great. On a very large table, it will take a while to run; could be several hours to several days but it will complete with the desired result.
这篇关于Sql Server - 从 VLT 中删除列(超大表)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!