Python3 - Is there a way to iterate row by row over a very large SQLite table without loading the entire table into local memory?


Question


I have a very large table with 250,000+ rows, many containing a large text block in one of the columns. Right now it's 2.7 GB and expected to grow at least tenfold. I need to perform Python-specific operations on every row of the table, but only need to access one row at a time.


Right now my code looks something like this:

c.execute('SELECT * FROM big_table')
table = c.fetchall()        # loads every row into memory at once
for row in table:
    do_stuff_with_row(row)


This worked fine when the table was smaller, but the table is now larger than my available RAM, and Python hangs when I try to run it. Is there a better (more memory-efficient) way to iterate row by row over the entire table?

Answer


cursor.fetchall() fetches all results into a list first.


Instead, you can iterate over the cursor itself:

c.execute('SELECT * FROM big_table')
for row in c:               # the cursor yields one row at a time
    do_stuff_with_row(row)


This produces rows as needed, rather than loading them all first.
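As a self-contained sketch of the approach above, the snippet below builds a small in-memory database (the `big_table` schema and row contents are made up for illustration) and iterates over the cursor so that only one row is materialized at a time:

```python
import sqlite3

# Hypothetical stand-in for the real database file.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, body TEXT)")
c.executemany(
    "INSERT INTO big_table (body) VALUES (?)",
    [("text block %d" % i,) for i in range(5)],
)
conn.commit()

c.execute("SELECT * FROM big_table")
count = 0
for row in c:          # the cursor yields rows lazily, one at a time
    count += 1         # do_stuff_with_row(row) would go here
print(count)
conn.close()
```

If per-row overhead matters, `cursor.fetchmany(n)` offers a middle ground: it pulls rows in fixed-size batches instead of one at a time or all at once.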
