Python + MongoDB - Cursor iteration too slow


Problem description

I'm currently working on a search engine project.
We are using Python + MongoDB.
I'm having the following problem:

I have a pymongo cursor after executing a find() command against the MongoDB database.
The pymongo cursor has around 20k results.

I have noticed that iterating over the pymongo cursor is really slow compared with a normal iteration over, for example, a list of the same size.

I ran some benchmarks:

- iteration over a list of 20k strings: 0.001492 seconds
- iteration over a pymongo cursor with 20k results: 1.445343 seconds

The difference is huge. It may not be a problem with this number of results, but with millions of results the time would be unacceptable.

Does anyone have an idea of why pymongo cursors are so slow to iterate?
Any idea how I can iterate over the cursor in less time?

Some additional information:

  • Python v2.6

  • PyMongo v1.9

  • MongoDB v1.6 32-bit

Recommended answer

Remember that the pymongo driver does not give you all 20k results at once. It makes network calls to the MongoDB backend to fetch more items as you iterate, so of course it won't be as fast as iterating over a list of strings. However, I'd suggest trying to adjust the cursor's batch_size, as outlined in the API docs.
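To make the batch_size suggestion concrete, here is a minimal sketch of how the effect could be measured. It is not taken from the original answer: the helper name time_cursor_iteration is hypothetical, and it assumes it is given a pymongo Collection (or any object whose find() returns a cursor with a batch_size() method). It uses time.perf_counter, so it targets Python 3 rather than the Python 2.6 mentioned in the question.

```python
import time


def time_cursor_iteration(collection, batch_size, query=None):
    """Iterate over every matching document and return (count, seconds).

    Hypothetical helper for illustration. `collection` is assumed to be a
    pymongo Collection, or any object whose find() returns a cursor that
    supports batch_size().
    """
    # batch_size() sets how many documents the server returns per batch,
    # so a larger value means fewer network round trips while iterating.
    cursor = collection.find(query or {}).batch_size(batch_size)

    start = time.perf_counter()
    count = 0
    for _ in cursor:
        count += 1
    elapsed = time.perf_counter() - start
    return count, elapsed
```

With a real collection, one could call this with a few different values (for example 100, 1000 and 5000) and compare the timings. Larger batches trade more memory per fetch for fewer round trips, so the best value depends on document size and network latency.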
