How to upsert pandas DataFrame to Microsoft SQL Server table?


Problem description

I would like to upsert my pandas DataFrame into a SQL Server table. This question has a workable solution for PostgreSQL, but T-SQL does not have an ON CONFLICT variant of INSERT. How can I accomplish the same thing for SQL Server?

Recommended answer

There are two options:

  1. Use a MERGE statement instead of INSERT ... ON CONFLICT.
  2. Use an UPDATE statement with a JOIN, followed by a conditional INSERT.
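
For the first option, the following is a minimal sketch of what a MERGE statement could look like for a staging table such as #temp_table and a target such as main_table. The helper name build_merge_sql and its parameters are hypothetical and only illustrate the shape of the statement. Table and column names are interpolated directly into the SQL text, so the sketch assumes they come from trusted code rather than user input.

```python
def build_merge_sql(target, source, key_cols, value_cols):
    """Build a T-SQL MERGE statement that upserts `source` into `target`.

    Hypothetical helper for illustration only: identifiers are inserted
    as-is, so they must come from trusted code.
    """
    # rows match when all key columns are equal
    on_clause = " AND ".join(f"t.[{c}] = s.[{c}]" for c in key_cols)
    # matched rows get their non-key columns overwritten
    set_clause = ", ".join(f"t.[{c}] = s.[{c}]" for c in value_cols)
    all_cols = list(key_cols) + list(value_cols)
    insert_cols = ", ".join(f"[{c}]" for c in all_cols)
    insert_vals = ", ".join(f"s.[{c}]" for c in all_cols)
    return (
        f"MERGE {target} AS t\n"
        f"USING {source} AS s\n"
        f"    ON {on_clause}\n"
        f"WHEN MATCHED THEN\n"
        f"    UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED BY TARGET THEN\n"
        f"    INSERT ({insert_cols}) VALUES ({insert_vals});"
    )


merge_sql = build_merge_sql("main_table", "#temp_table", ["id"], ["txt"])
print(merge_sql)
```

The resulting string could be passed to conn.execute(sa.text(merge_sql)) after the DataFrame has been uploaded to the staging table. Note that T-SQL requires a MERGE statement to end with a semicolon.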

The T-SQL MERGE documentation says:

Performance Tip: The conditional behavior described for the MERGE statement works best when the two tables have a complex mixture of matching characteristics. For example, inserting a row if it doesn't exist, or updating a row if it matches. When simply updating one table based on the rows of another table, improve the performance and scalability with basic INSERT, UPDATE, and DELETE statements.

In many cases it is faster and less complicated to simply use separate UPDATE and INSERT statements, as in the following example:

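import pandas as pd
import sqlalchemy as sa

# connection_uri is assumed to be defined elsewhere: a SQLAlchemy URL
# pointing at the SQL Server database (e.g. using the mssql+pyodbc dialect)
engine = sa.create_engine(
    connection_uri, fast_executemany=True, isolation_level="SERIALIZABLE"
)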

with engine.begin() as conn:
    # step 0.0 - create test environment
    conn.execute(sa.text("DROP TABLE IF EXISTS main_table"))
    conn.execute(
        sa.text(
            "CREATE TABLE main_table (id int primary key, txt varchar(50))"
        )
    )
    conn.execute(
        sa.text(
            "INSERT INTO main_table (id, txt) VALUES (1, 'row 1 old text')"
        )
    )
    # step 0.1 - create DataFrame to UPSERT
    df = pd.DataFrame(
        [(2, "new row 2 text"), (1, "row 1 new text")], columns=["id", "txt"]
    )

    # step 1 - upload DataFrame to temporary table
    df.to_sql("#temp_table", conn, index=False, if_exists="replace")

    # step 2 - merge temp_table into main_table
    conn.execute(
        sa.text("""\
            UPDATE main SET main.txt = temp.txt
            FROM main_table main INNER JOIN #temp_table temp
                ON main.id = temp.id
            """
        )
    )
    conn.execute(
        sa.text("""\
            INSERT INTO main_table (id, txt) 
            SELECT id, txt FROM #temp_table
            WHERE id NOT IN (SELECT id FROM main_table) 
            """
        )
    )

    # step 3 - confirm results
    result = conn.execute(sa.text("SELECT * FROM main_table ORDER BY id")).fetchall()
    print(result)  # [(1, 'row 1 new text'), (2, 'new row 2 text')]
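
The UPDATE and INSERT statements above are written by hand for a single key column. As a rough sketch of how the same pattern might be generated for other tables, the hypothetical helper below (build_update_insert_sql is not part of pandas or SQLAlchemy) builds both statements from lists of key and value columns. It uses NOT EXISTS rather than NOT IN so that composite keys can be handled. Identifiers are interpolated directly, so the sketch assumes they come from trusted code.

```python
def build_update_insert_sql(target, temp, key_cols, value_cols):
    """Build the UPDATE-with-JOIN and conditional INSERT statements.

    Hypothetical helper for illustration only: identifiers are inserted
    as-is, so they must come from trusted code.
    """
    # join condition shared by the UPDATE and the NOT EXISTS check
    join_on = " AND ".join(f"main.[{c}] = temp.[{c}]" for c in key_cols)
    set_clause = ", ".join(f"main.[{c}] = temp.[{c}]" for c in value_cols)
    all_cols = [*key_cols, *value_cols]
    insert_cols = ", ".join(f"[{c}]" for c in all_cols)
    select_cols = ", ".join(f"temp.[{c}]" for c in all_cols)

    # update rows that already exist in the target table
    update_sql = (
        f"UPDATE main SET {set_clause} "
        f"FROM {target} main INNER JOIN {temp} temp ON {join_on}"
    )
    # insert rows whose key is not yet present in the target table
    insert_sql = (
        f"INSERT INTO {target} ({insert_cols}) "
        f"SELECT {select_cols} FROM {temp} temp "
        f"WHERE NOT EXISTS (SELECT 1 FROM {target} main WHERE {join_on})"
    )
    return update_sql, insert_sql


update_sql, insert_sql = build_update_insert_sql(
    "main_table", "#temp_table", ["id"], ["txt"]
)
print(update_sql)
print(insert_sql)
```

Both strings would be executed in order, UPDATE first and then INSERT, inside the same transaction that uploaded the DataFrame to the temporary table.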
