使用 pyODBC 的 fast_executemany 加速 pandas.DataFrame.to_sql [英] Speeding up pandas.DataFrame.to_sql with fast_executemany of pyODBC

查看：39 发布时间：2021/12/30 14:21:13 python sqlalchemy pyodbc pandas-to-sql

本文介绍了使用 pyODBC 的 fast_executemany 加速 pandas.DataFrame.to_sql的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我想将一个大的 pandas.DataFrame 发送到运行 MS SQL 的远程服务器.我现在这样做的方法是将 data_frame 对象转换为元组列表，然后使用 pyODBC 的 executemany() 函数将其发送出去.它是这样的:

I would like to send a large pandas.DataFrame to a remote server running MS SQL. The way I do it now is by converting a data_frame object to a list of tuples and then send it away with pyODBC's executemany() function. It goes something like this:

 import pyodbc as pdb

 list_of_tuples = convert_df(data_frame)

 connection = pdb.connect(cnxn_str)

 cursor = connection.cursor()
 cursor.fast_executemany = True
 cursor.executemany(sql_statement, list_of_tuples)
 connection.commit()

 cursor.close()
 connection.close()

然后我开始怀疑是否可以通过使用 data_frame.to_sql() 方法来加快速度(或至少更具可读性).我想出了以下解决方案:

I then started to wonder if things can be sped up (or at least more readable) by using data_frame.to_sql() method. I have came up with the following solution:

 import sqlalchemy as sa

 engine = sa.create_engine("mssql+pyodbc:///?odbc_connect=%s" % cnxn_str)
 data_frame.to_sql(table_name, engine, index=False)

现在代码更具可读性，但上传至少慢了 150 倍...

Now the code is more readable, but the upload is at least 150 times slower...

有没有办法在使用 SQLAlchemy 时翻转 fast_executemany?

Is there a way to flip the fast_executemany when using SQLAlchemy?

我使用的是 pandas-0.20.3、pyODBC-4.0.21 和 sqlalchemy-1.1.13.

I am using pandas-0.20.3, pyODBC-4.0.21 and sqlalchemy-1.1.13.

推荐答案

在联系了SQLAlchemy的开发者之后，出现了解决这个问题的方法.非常感谢他们的出色工作！

After contacting the developers of SQLAlchemy, a way to solve this problem has emerged. Many thanks to them for the great work!

必须使用游标执行事件并检查是否已引发 executemany 标志.如果确实如此，请打开 fast_executemany 选项.例如:

One has to use a cursor execution event and check if the executemany flag has been raised. If that is indeed the case, switch the fast_executemany option on. For example:

from sqlalchemy import event

@event.listens_for(engine, 'before_cursor_execute')
def receive_before_cursor_execute(conn, cursor, statement, params, context, executemany):
    if executemany:
        cursor.fast_executemany = True

关于执行事件的更多信息可以在此处.

More information on execution events can be found here.

更新:在SQLAlchemy 1.3.0，因此不再需要此 hack.

UPDATE: Support for fast_executemany of pyodbc was added in SQLAlchemy 1.3.0, so this hack is not longer necessary.

这篇关于使用 pyODBC 的 fast_executemany 加速 pandas.DataFrame.to_sql的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

使用 pyODBC 的 fast_executemany 加速 pandas.DataFrame.to_sql [英] Speeding up pandas.DataFrame.to_sql with fast_executemany of pyODBC

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

使用 pyODBC 的 fast_executemany 加速 pandas.DataFrame.to_sql [英] Speeding up pandas.DataFrame.to_sql with fast_executemany of pyODBC

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭