UPDATE or MERGE of very big tables in SQL Server


Problem Description

I need to perform a daily update of a very large (300M records) and broad TABLE1. The source data for the updates is located in another table, UTABLE, that has 10%-25% of the rows of TABLE1 but is narrow. Both tables have record_id as a primary key.

Presently, I am recreating TABLE1 using the following approach:

<!-- language: sql -->
    -- 1) Copy the required rows into a temporary table
    SELECT (required columns) INTO TMP_TABLE1
    FROM TABLE1 T JOIN UTABLE U ON T.record_id = U.record_id;
    -- 2) Drop the original table
    DROP TABLE TABLE1;
    -- 3) Rename the copy to take its place
    EXEC sp_rename 'TMP_TABLE1', 'TABLE1';

However, this takes nearly 40 minutes on my server (60 GB of RAM for SQL Server). I want to achieve a 50% performance gain - what other options can I try?
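One detail worth noting about this copy-and-swap approach: SELECT ... INTO creates the new table without the primary key or any indexes, so they have to be re-created before the table goes back into service. A minimal sketch, assuming a single hypothetical value_col being refreshed (the LEFT JOIN with COALESCE keeps all TABLE1 rows, taking updated values where UTABLE has them - an inner join would keep only the matched subset):

<!-- language: sql -->
    -- Sketch only; table/column names beyond TABLE1, UTABLE and record_id
    -- are illustrative.
    SELECT T.record_id,
           COALESCE(U.value_col, T.value_col) AS value_col
    INTO   TMP_TABLE1
    FROM   TABLE1 T
    LEFT JOIN UTABLE U ON T.record_id = U.record_id;

    DROP TABLE TABLE1;
    EXEC sp_rename 'TMP_TABLE1', 'TABLE1';

    -- Restore the primary key that SELECT INTO does not copy
    ALTER TABLE TABLE1 ADD CONSTRAINT PK_TABLE1 PRIMARY KEY (record_id);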

  1. MERGE and UPDATE - something like the code below works fast only for a very small UTABLE table; at full size, everything just hangs:

<!-- language: SQL -->
    MERGE TABLE1 AS target
    USING UTABLE AS source
    ON target.record_id = source.record_id
    WHEN MATCHED THEN
        UPDATE SET target.columns = source.columns;

  • I heard that I can perform a batched MERGE by using ROWCOUNT - but I don't think it would be fast enough for a 300M-row table.

    Any SQL query hints that could be helpful?
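For reference, the batching idea mentioned above is usually written today with UPDATE TOP rather than SET ROWCOUNT (which is deprecated for UPDATE/DELETE in newer SQL Server versions). A hedged sketch, with a hypothetical value_col and batch size:

<!-- language: sql -->
    -- Hypothetical batched UPDATE: process 500k rows per statement so the
    -- transaction log stays small and locks are released between batches.
    DECLARE @batch INT = 500000;
    DECLARE @rows  INT = 1;

    WHILE @rows > 0
    BEGIN
        UPDATE TOP (@batch) T
        SET    T.value_col = U.value_col
        FROM   TABLE1 T
        JOIN   UTABLE U ON T.record_id = U.record_id
        -- Skip rows already up to date; this also guarantees the loop
        -- terminates (assumes value_col is NOT NULL).
        WHERE  T.value_col <> U.value_col;

        SET @rows = @@ROWCOUNT;
    END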

Recommended Answer

Actually, I've found general recommendations for such queries: the idea of using SQL MERGE or UPDATE is a very clever one, but it fails when we need to update many records (e.g. 75M) in a big and wide table (e.g. 240M rows).

Looking at the query plan of the query below, we can see that the TABLE SCAN of TABLE1 and the final MERGE take 90% of the time.

    MERGE TABLE1 AS target
    USING UTABLE AS source
    ON target.record_id = source.record_id
    WHEN MATCHED AND (condition) THEN
        UPDATE SET target.columns = source.columns;

    So in order to use MERGE we need to:

    1. Reduce the number of rows we need to update and correctly pass this information to SQL Server. This can be done by making UTABLE smaller or by specifying an additional condition that narrows the part to be merged.
    2. Make sure the part to be merged fits in memory; otherwise the query runs much slower. Halving TABLE1 reduced my real query time from 11 hours to 40 minutes.

    As Mark mentioned, you can use the UPDATE syntax with a WHERE clause to narrow the part to be merged - this will give the same results. Also, avoid indexing TABLE1, as this causes additional work to rebuild the indexes during the MERGE.
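The UPDATE-based alternative mentioned above might look like the following sketch; the value_col name and the narrowing predicate are illustrative, mirroring the "WHEN MATCHED AND (condition)" filter in the MERGE form:

<!-- language: sql -->
    -- Equivalent UPDATE form: the WHERE clause narrows the part to be
    -- merged, so unchanged rows are never touched.
    UPDATE T
    SET    T.value_col = U.value_col
    FROM   TABLE1 T
    JOIN   UTABLE U ON T.record_id = U.record_id
    WHERE  T.value_col <> U.value_col;  -- only rows that actually change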

