Optimal chunksize parameter in pandas.DataFrame.to_sql
Question
Working with a large pandas DataFrame that needs to be dumped into a PostgreSQL table. From what I've read, it's not a good idea to dump it all at once (I was locking up the db); instead, use the chunksize parameter. The answers here are helpful for workflow, but I'm asking specifically about how the value of chunksize affects performance.
In [5]: df.shape
Out[5]: (24594591, 4)
In [6]: df.to_sql('existing_table',
                  con=engine,
                  index=False,
                  if_exists='append',
                  chunksize=10000)
Is there a recommended default and is there a difference in performance when setting the parameter higher or lower? Assuming I have the memory to support a larger chunksize, will it execute faster?
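One way to answer this empirically is to time the insert at several chunk sizes. A minimal benchmarking sketch, assuming an in-memory SQLite database as a stand-in for the PostgreSQL engine (the frame and table name `bench_table` here are made up for illustration):

```python
import sqlite3
import time

import pandas as pd

# Small synthetic frame standing in for the real 24M-row DataFrame.
df = pd.DataFrame({"a": range(50_000), "b": 1.0, "c": "x", "d": 2})

for chunksize in (500, 5_000, 50_000):
    con = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL engine
    start = time.perf_counter()
    df.to_sql("bench_table", con, index=False, if_exists="replace",
              chunksize=chunksize)
    elapsed = time.perf_counter() - start
    print(f"chunksize={chunksize}: {elapsed:.3f}s")
    con.close()
```

Against a real PostgreSQL server, each chunk costs a network round-trip, so the differences between chunk sizes will be much larger than in this local sketch.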
Answer
In my case, 3M rows with 5 columns were inserted in 8 minutes when I called pandas to_sql with chunksize=5000 and method='multi'. This was a huge improvement, as inserting 3M rows into the database from Python had been very slow before.
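As a concrete sketch of that call, again using an in-memory SQLite database as a stand-in for the PostgreSQL engine (with a real server you would pass a SQLAlchemy engine from `create_engine` instead; the frame here is a toy 5-column example):

```python
import sqlite3

import pandas as pd

# Toy 5-column frame, mirroring the 3M-row case at small scale.
df = pd.DataFrame({f"col{i}": range(150) for i in range(5)})

con = sqlite3.connect(":memory:")  # stand-in for create_engine("postgresql://...")
df.to_sql(
    "existing_table",
    con,
    index=False,
    if_exists="append",
    chunksize=5000,   # rows per batch
    method="multi",   # one multi-row INSERT per batch instead of row-by-row
)

rows = con.execute("SELECT COUNT(*) FROM existing_table").fetchone()[0]
print(rows)  # 150
```

method='multi' binds every row of a chunk into a single INSERT statement, which cuts per-statement overhead; note that each backend caps the number of bound parameters per statement (rows × columns), so very large chunks can fail with this method.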