pymongo.errors.CursorNotFound: cursor id '...' not valid at server


Question

I am trying to fetch some ids that exist in a Mongo database with the following code:

from pymongo import MongoClient

client = MongoClient('xx.xx.xx.xx', xxx)
db = client.test_database
db = client['...']
collection = db.test_collection
collection = db["..."]


# Each item yielded by find() is a document (a dict), even though it is named "cursor" here.
for cursor in collection.find({ "$and" : [{ "followers" : { "$gt" : 2000 } }, { "followers" : { "$lt" : 3000 } }, { "list_followers" : { "$exists" : False } }] }):
    print(cursor['screenname'])
    print(cursor['_id']['uid'])
    id = cursor['_id']['uid']

However, after a short while, I receive this error:

pymongo.errors.CursorNotFound: cursor id '...' not valid at server.

I found this article, which refers to that problem. Nevertheless, it is not clear to me which solution to take. Is it possible to use find().batch_size(30)? What exactly does that command do? Can I retrieve all the database ids using batch_size?

Recommended answer

You're getting this error because the cursor is timing out on the server (after 10 minutes of inactivity).

From the pymongo documentation:

Cursors in MongoDB can time out on the server if they've been open for a long time without any operations being performed on them. This can lead to a CursorNotFound exception being raised when attempting to iterate the cursor.

When you call the collection.find method, it queries the collection and returns a cursor over the matching documents. To get the documents, you iterate the cursor. As you iterate, the driver makes requests to the MongoDB server to fetch more data. The amount of data returned in each request is set by the batch_size() method.

Documentation:

Limits the number of documents returned in one batch. Each batch requires a round trip to the server. It can be adjusted to optimize performance and limit data transfer.

Setting the batch_size to a lower value will help you with the timeout errors, but it will increase the number of round trips to the MongoDB server needed to get all the documents.
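As a rough illustration of the trade-off, here is a minimal sketch that applies batch_size() to a find() cursor. The helper name iterate_with_batch_size and the default of 30 (taken from the question) are illustrative assumptions, not part of pymongo; the collection argument is expected to be a pymongo Collection.

```python
def iterate_with_batch_size(collection, query, batch_size=30):
    """Yield documents matching `query`, fetching at most `batch_size` per round trip.

    Hypothetical helper: `collection` is assumed to be a pymongo Collection.
    """
    # batch_size() returns the cursor itself, so it can be chained onto find().
    cursor = collection.find(query).batch_size(batch_size)
    for doc in cursor:
        # The driver asks the server for the next batch only when the
        # current one has been fully consumed.
        yield doc


# Example usage (assumes a reachable MongoDB server; host, port, and names are placeholders):
# from pymongo import MongoClient
# collection = MongoClient("localhost", 27017)["mydb"]["users"]
# for doc in iterate_with_batch_size(collection, {"followers": {"$gt": 2000, "$lt": 3000}}):
#     print(doc["_id"])
```

All matching documents are still returned; batch_size only changes how many arrive per request, so a smaller value shortens the gap between requests at the cost of more round trips.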

Default batch size:

For most queries, the first batch returns 101 documents or just enough documents to exceed 1 megabyte. Batch size will not exceed the maximum BSON document size (16 MB).

There is no universal "right" batch size. You should test with different values and find the one appropriate for your use case, i.e. how many documents you can process in a 10-minute window.
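The "10-minute window" reasoning can be turned into a back-of-the-envelope estimate. The sketch below assumes that the time spent on a batch is roughly batch size times the average per-document processing time, and that the server timeout is the 10 minutes (600 seconds) mentioned above. The function name, the safety_factor parameter, and its default of 0.5 are illustrative assumptions; measure your own per-document time before relying on the result.

```python
import math


def max_batch_size(seconds_per_doc, timeout_seconds=600, safety_factor=0.5):
    """Estimate the largest batch that can be processed before the cursor times out.

    Hypothetical helper: keeps the per-batch processing time below
    timeout_seconds * safety_factor.
    """
    if seconds_per_doc <= 0:
        raise ValueError("seconds_per_doc must be positive")
    budget = timeout_seconds * safety_factor  # seconds we allow per batch
    # Never return less than 1: a batch must contain at least one document.
    return max(1, math.floor(budget / seconds_per_doc))


# With 2 seconds of work per document: 600 * 0.5 / 2.0 = 150 documents per batch.
print(max_batch_size(2.0))
```

Treat the result as a starting point for testing rather than a guarantee, since per-document processing time usually varies.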

As a last resort, you can set no_cursor_timeout=True. But you need to make sure that the cursor is closed after you finish processing the data.

How to avoid it without try/except:

cursor = collection.find(
    {"x": 1},
    no_cursor_timeout=True
)
for doc in cursor:
    # do something with doc
    pass
cursor.close()
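If the processing step can raise an exception, the final cursor.close() in the snippet above is skipped and the cursor stays open on the server. One way to guard against that is a try/finally block, sketched below. The helper name process_without_timeout and the handle callback are illustrative assumptions; find(..., no_cursor_timeout=True) and close() are the same calls used above.

```python
def process_without_timeout(collection, query, handle):
    """Apply `handle` to every matching document and always close the cursor.

    Hypothetical helper: `collection` is assumed to be a pymongo Collection
    and `handle` any callable that accepts one document.
    """
    cursor = collection.find(query, no_cursor_timeout=True)
    processed = 0
    try:
        for doc in cursor:
            handle(doc)
            processed += 1
    finally:
        # Runs whether the loop finished normally or `handle` raised,
        # so the server-side cursor is released in both cases.
        cursor.close()
    return processed


# Example usage (assumes an existing pymongo collection object):
# count = process_without_timeout(collection, {"x": 1}, print)
```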
