在SQL Server 2008中插入/更新大量数据的最佳做法 [英] Best practices for inserting/updating large amount of data in SQL Server 2008

查看:423
本文介绍了在SQL Server 2008中插入/更新大量数据的最佳做法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在建立一个通过各种CSV Feed更新大量数据的系统。通常我只是循环通过feed中的每一行,做一个select查询来检查项目是否已经存在,并插入/更新项目,如果它存在与否。



我觉得这种方法不是很可扩展,可以在更大的Feed上敲击服务器。我的解决方案是像正常循环项目,但将它们存储在内存中。然后对于每100个左右的项目做一个选择在这100个项目,并得到一个列表在数据库中匹配的现有项目。然后将insert / update语句连接在一起,并将它们运行到数据库中。这将基本上减少对数据库的访问。



这是一个可扩展性足够的解决方案,是否有任何有关将大型Feed导入生产环境的示例教程?



感谢

解决方案

看到您正在使用SQL Server 2008,我会推荐这种方法:




  • 首先将CSV文件批量复制到暂存表中

  • 使用MERGE命令从暂存表更新目标表



查看 MSDN文档有关如何使用MERGE命令的精彩博文



基本上,您创建




    • 行匹配,例如该行同时存在于源表和目标表中 - >通常您会更新一些字段,或者只是将其全部忽略。

    • 源中的行不存在目标 - >通常是INSERT的情况



    您将有 MERGE 语句如下:

      MERGE TargetTable AS t 
    使用SourceTable AS src
    ON t。 PrimaryKey = src.PrimaryKey

    当不匹配时
    INSERT(字段列表)
    VALUES(值列表)

    当匹配时
    UPDATE
    SET(SET语句列表)
    ;

    当然, ON 如果需要更多的参与。当然, WHEN 语句也可以更复杂,例如

      WHEN MATCHED AND(some other condition)THEN ...... 

    等等。 / p>

    MERGE 是一个非常强大和非常有用的新命令在SQL Server 2008 - 使用它,如果你可以! / p>

    I'm building a system for updating large amounts of data through various CSV feeds. Normally I would just loop though each row in the feed, do a select query to check if the item already exists and insert/update an item depending if it exists or not.

    I feel this method isn't very scalable and could hammer the server on larger feeds. My solution is to loop through the items as normal but store them in memory. Then for every 100 or so items do a select on those 100 items and get a list of existing items in the database that match. Then concatenate the insert/update statements together and run them into the database. This would essentially cut down on the trips to the database.

    Is this a scalable enough solution and are there any example tutorials on importing large feeds into a productive environment?

    Thanks

    解决方案

    Seeing that you're using SQL Server 2008, I would recommend this approach:

    • first bulkcopy your CSV files into a staging table
    • update your target table from that staging table using the MERGE command

    Check out the MSDN docs and a great blog post on how to use the MERGE command.

    Basically, you create a link between your actual data table and the staging table on a common criteria (e.g. a common primary key), and then you can define what to do when

    • the rows match, e.g. the row exists in both the source and the target table --> typically you'd either update some fields, or just ignore it all together
    • the row from the source doesn't exist in the target --> typically a case for an INSERT

    You would have a MERGE statement something like this:

    MERGE TargetTable AS t
    USING SourceTable AS src
    ON t.PrimaryKey = src.PrimaryKey
    
    WHEN NOT MATCHED THEN
      INSERT (list OF fields)
      VALUES (list OF values)
    
    WHEN MATCHED THEN
      UPDATE
        SET (list OF SET statements)
    ;
    

    Of course, the ON clause can be much more involved if needed. And of course, your WHEN statements can also be more complex, e.g.

    WHEN MATCHED AND (some other condition) THEN ......
    

    and so forth.

    MERGE is a very powerful and very useful new command in SQL Server 2008 - use it, if you can!

    这篇关于在SQL Server 2008中插入/更新大量数据的最佳做法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆