UPDATE or MERGE of very big tables in SQL Server


Question

I need to perform a daily update of a very large (300M records) and wide table, TABLE1. The source data for the updates is in another table, UTABLE, which has 10%-25% as many rows as TABLE1 but is narrow. Both tables have record_id as their primary key.

At present, I recreate TABLE1 using the following approach:

<!-- language: sql -->
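    -- 1) Build the replacement table from the join
    SELECT (required columns) INTO TMP_TABLE1
    FROM TABLE1 T JOIN UTABLE U ON T.record_id = U.record_id;
    -- 2) Drop the old table
    DROP TABLE TABLE1;
    -- 3) Give the new table the old name
    EXEC sp_rename 'TMP_TABLE1', 'TABLE1';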

However, this takes nearly 40 minutes on my server (60 GB of RAM for SQL Server). I want to achieve a 50% performance gain. What other options can I try?

  1. MERGE and UPDATE: something like the code below is faster only for a very small UTABLE; at full size, everything just hangs:

<!-- language: SQL -->
    MERGE TABLE1 AS target
    USING UTABLE AS source
    ON target.record_id = source.record_id
    WHEN MATCHED THEN
        UPDATE SET target.columns = source.columns;  -- MERGE must end with a semicolon

  • I heard that I can perform a batched MERGE by using ROWCOUNT, but I don't think it can be fast enough for a 300M-row table.

Are there any SQL query hints that could help?

Recommended answer

Actually, I've found some general recommendations for such queries: using SQL MERGE or UPDATE is a clever idea, but it fails when we need to update many records (e.g. 75M) in a big and wide table (e.g. 240M).

Looking at the query plan of the query below, the TABLE SCAN of TABLE1 and the final MERGE take 90% of the time.

    MERGE TABLE1 as Target  
    USING UTABLE as source  
    ON Target.record_id = source.record_id   
    WHEN MATCHED AND (condition) THEN   
        UPDATE SET Target.columns=source.columns;
    

So, in order to use MERGE, we need to:

1. Reduce the number of rows we need to update and correctly pass this information to SQL Server. This can be done by making UTABLE smaller or by specifying an additional condition that narrows the part to be merged.
2. Make sure that the part to be merged fits in memory; otherwise the query runs much slower. Halving the size of TABLE1 reduced my real query time from 11 hours to 40 minutes.
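As an illustration of the first point, one possible additional condition is a change check that skips rows whose values already match. This is only a sketch: `col1` and `col2` are hypothetical column names standing in for the real column list, and whether it helps depends on how many rows in UTABLE actually differ from TABLE1.

```sql
-- Sketch only: col1 and col2 are hypothetical columns.
MERGE TABLE1 AS target
USING UTABLE AS source
ON target.record_id = source.record_id
WHEN MATCHED AND EXISTS (
        -- EXCEPT treats NULLs as equal, so this returns a row only
        -- when at least one of the compared values differs.
        SELECT source.col1, source.col2
        EXCEPT
        SELECT target.col1, target.col2
     ) THEN
    UPDATE SET target.col1 = source.col1,
               target.col2 = source.col2;
```

The `EXISTS (... EXCEPT ...)` form is used here instead of `<>` comparisons so that NULL values are compared sensibly without extra `IS NULL` checks.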

As Mark mentioned, you can use UPDATE syntax with a WHERE clause to narrow the part to be merged; this gives the same result. Also, please avoid indexing TABLE1, since each index has to be maintained during the MERGE, which adds work.
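The UPDATE form can also be combined with the fits-in-memory point by processing one key range at a time. The following is a minimal sketch, not a tuned solution: it assumes `record_id` is an integer key, `col1` and `col2` are hypothetical column names, and the batch size is an arbitrary starting value.

```sql
-- Sketch only: assumes an integer record_id; col1/col2 are hypothetical.
DECLARE @batch_size bigint = 1000000;  -- assumed value; tune for your server
DECLARE @from_id bigint, @max_id bigint;

-- Only the key range present in UTABLE needs to be visited.
SELECT @from_id = MIN(record_id), @max_id = MAX(record_id)
FROM UTABLE;

-- If UTABLE is empty, both variables are NULL and the loop is skipped.
WHILE @from_id <= @max_id
BEGIN
    -- The WHERE clause narrows each statement to one slice of keys.
    UPDATE T
    SET    T.col1 = U.col1,
           T.col2 = U.col2
    FROM   TABLE1 AS T
    JOIN   UTABLE AS U
           ON U.record_id = T.record_id
    WHERE  U.record_id >= @from_id
      AND  U.record_id <  @from_id + @batch_size;

    SET @from_id = @from_id + @batch_size;
END;
```

Each slice covers a fixed span of key values, so if the keys are sparse the number of rows per batch will vary. Because both tables have record_id as their primary key, the range predicate can be resolved by a seek on that key.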
