How to read through collection in chunks by 1000?
Problem Description
I need to read the whole collection from MongoDB (the collection name is "test") in Python. I tried:
from pymongo import Connection  # note: Connection is deprecated; modern PyMongo uses MongoClient

self.__connection__ = Connection('localhost', 27017)
dbh = self.__connection__['test_db']
collection = dbh['test']
How can I read through the collection in chunks of 1000 (to avoid memory overflow, since the collection can be very large)?
Recommended Answer
I agree with Remon, but you mention batches of 1000, which his answer doesn't really cover. You can set a batch size on the cursor:
cursor.batch_size(1000)
You can also skip records, e.g.:
cursor.skip(4000)
Is this what you're looking for? This is effectively a pagination pattern. However, if you're just trying to avoid memory exhaustion, you don't really need to set a batch size or skip at all: iterating over a PyMongo cursor fetches documents from the server lazily, in batches, rather than loading the whole collection into memory at once.
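If you do want to process the results in explicit groups of 1000, one minimal sketch is a chunking helper built on `itertools.islice`. The helper works on any iterator, so the same code applies to a PyMongo cursor; the `test_db`/`test` names and the `handle` function in the comment are taken from the question and are placeholders for your own processing logic:

```python
from itertools import islice

def iter_chunks(cursor, size=1000):
    """Yield lists of up to `size` documents from any iterator or cursor."""
    it = iter(cursor)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# With PyMongo this would look like (assumes a local MongoDB instance):
#   from pymongo import MongoClient
#   collection = MongoClient('localhost', 27017)['test_db']['test']
#   for chunk in iter_chunks(collection.find().batch_size(1000)):
#       handle(chunk)  # your processing function

# The helper is driver-agnostic; demonstrated here with a plain range:
chunks = list(iter_chunks(range(2500), size=1000))
print([len(c) for c in chunks])  # → [1000, 1000, 500]
```

Because only one chunk of documents is held in Python at a time, memory use stays bounded regardless of collection size.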