Python Pandas - Using to_sql to write large data frames in chunks
Question
I'm using Pandas' to_sql function to write to MySQL, which is timing out due to large frame size (1M rows, 20 columns).
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_sql.html
Is there a more official way to chunk through the data and write rows in blocks? I've written my own code, which seems to work. I'd prefer an official solution though. Thanks!
import pandas as pd
import sqlalchemy

def write_to_db(engine, frame, table_name, chunk_size):
    start_index = 0
    end_index = chunk_size if chunk_size < len(frame) else len(frame)
    # Replace NaN with None so the database stores NULL
    frame = frame.where(pd.notnull(frame), None)
    if_exists_param = 'replace'
    while start_index != end_index:
        print("Writing rows %s through %s" % (start_index, end_index))
        frame.iloc[start_index:end_index, :].to_sql(con=engine, name=table_name, if_exists=if_exists_param)
        # After the first chunk replaces the table, subsequent chunks append
        if_exists_param = 'append'
        start_index = min(start_index + chunk_size, len(frame))
        end_index = min(end_index + chunk_size, len(frame))

engine = sqlalchemy.create_engine('mysql://...')  # database details omitted
write_to_db(engine, frame, 'retail_pendingcustomers', 20000)
Answer
Update: this functionality has been merged in pandas master and will be released in 0.15 (probably end of September), thanks to @artemyk! See https://github.com/pydata/pandas/pull/8062
So starting from 0.15, you can specify the chunksize argument and e.g. simply do:
df.to_sql('table', engine, chunksize=20000)
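For illustration, here is a self-contained sketch of the chunksize parameter in action. An in-memory SQLite engine stands in for the MySQL server so the example runs anywhere, and the small frame and batch size of 25 are placeholders for the real 1M-row frame and 20000-row chunks:

```python
import pandas as pd
import sqlalchemy

# Small frame standing in for the large one
df = pd.DataFrame({"a": range(100), "b": range(100)})

# In-memory SQLite engine so the sketch needs no MySQL server;
# for MySQL the URL would look like 'mysql://user:pass@host/db'
engine = sqlalchemy.create_engine("sqlite://")

# chunksize splits the INSERTs into batches of 25 rows each,
# instead of one giant statement that can time out
df.to_sql("retail_pendingcustomers", engine,
          if_exists="replace", chunksize=25, index=False)

# Verify that all rows arrived
count = pd.read_sql(
    "SELECT COUNT(*) AS n FROM retail_pendingcustomers", engine)["n"][0]
print(count)  # 100
```

The batching happens inside a single to_sql call, so the manual replace-then-append bookkeeping from the question is no longer needed.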