Pandas using too much memory with read_sql_table

Views: 124 Published: 2020/5/24 1:53:33 Tags: python postgresql pandas sqlalchemy


Problem description

I am trying to read a table from my Postgres database into Python. The table has around 8 million rows and 17 columns, and takes up 622MB in the DB.

I can export the entire table to CSV using psql and then read it in with pd.read_csv(). That works perfectly fine: the Python process only uses around 1GB of memory.
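To compare the two routes with numbers rather than Task Manager readings, the DataFrame's own footprint can be inspected. The snippet below is only a minimal sketch: the inline CSV text and its column names are hypothetical stand-ins for the file exported with psql, not the real table.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the CSV file exported with psql.
csv_text = "id,name,amount\n1,alpha,1.5\n2,beta,2.5\n3,gamma,3.5\n"

frame = pd.read_csv(io.StringIO(csv_text))

# deep=True also counts the Python string objects held in object columns,
# which the default (shallow) measurement leaves out.
per_column_bytes = frame.memory_usage(deep=True)
total_mb = per_column_bytes.sum() / 1024 ** 2

print(frame.dtypes)
print(per_column_bytes)
print(f"total: {total_mb:.4f} MB")
```

Running the same measurement on a frame loaded from the database would show whether the columns end up with the same dtypes in both cases.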

Now, the task we need to do requires this pull to be automated, so I thought I could read the table directly from the DB using pd.read_sql_table(), with the following code:

import pandas as pd
import sqlalchemy

# Connect to Postgres and load the whole table into a single DataFrame
engine = sqlalchemy.create_engine("postgresql://username:password@hostname:5432/db")
the_frame = pd.read_sql_table(table_name='table_name', con=engine, schema='schemaname')

This approach starts using a lot of memory. When I track it in Task Manager, the Python process's memory usage climbs steadily until it reaches 16GB and freezes the computer.

Any ideas on why this might be happening are appreciated.
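One commonly suggested way to keep peak memory bounded when pulling a large table is to read it in chunks. The sketch below is an illustration under stated assumptions, not a confirmed fix for the case above: an in-memory SQLite database stands in for Postgres so the example runs anywhere, and the two-column table and the chunk size of 4 are made up for the demo.

```python
import sqlite3

import pandas as pd

# Assumption for the demo: in-memory SQLite stands in for the Postgres database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table_name (id INTEGER, value REAL)")
con.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [(i, i * 0.5) for i in range(10)],
)
con.commit()

# With chunksize set, pandas returns an iterator of DataFrames
# instead of building one large DataFrame.
chunks = pd.read_sql_query("SELECT * FROM table_name", con, chunksize=4)

row_count = 0
chunk_sizes = []
for chunk in chunks:
    chunk_sizes.append(len(chunk))
    row_count += len(chunk)  # process or aggregate each chunk here

con.close()
print(chunk_sizes, row_count)
```

pd.read_sql_table() accepts a chunksize argument as well, so the same loop should carry over to the SQLAlchemy engine in the question. Depending on the database driver, the full result set may still be buffered on the client unless a server-side cursor is used (for example via SQLAlchemy's stream_results execution option), so chunking alone may not be enough.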
