PyMongo raises [errno 49] can't assign requested address after a large number of queries


Question

I have a MongoDB collection with more than 1,000,000 documents. I run an initial .find({ my_query }) that returns a subset of those documents (about 25,000), which I then put into a list object.

I then loop over the documents in that list, parse some values from each one, and run an additional query with those parsed values using this code:

import pymongo

def _perform_queries(query):
    conn = pymongo.MongoClient('mongodb://localhost:27017')
    try:
        coll = conn.databases['race_results']
        races = coll.find(query).sort("date", -1)
    except BaseException as err:
        print('An error occurred in runner query: %s\n' % err)
    finally:
        conn.close()
        return races

In this case, my query dictionary is:

{"$and": [{"opponents":
    {"$elemMatch": {"$and": [
        {"runner.name": name},
        {"runner.jockey": jockey}
    ]}}},
    {"summary.dist": "1"}
]}

Here is my issue. I have created an index on opponents.runner.name and opponents.runner.jockey, which makes the queries very fast. However, after about 10,000 queries in a row, pymongo raises an exception:

pymongo.errors.AutoReconnect: [Errno 49] Can't assign requested address

When I remove the index, I don't see this error. But each query then takes about 0.5 seconds, which is unusable in my case.

Does anyone know why the [Errno 49] can't assign requested address error could be occurring? I've seen a few other SO questions about can't assign requested address, but none related to pymongo, and their answers don't lead me anywhere.

Update:

Following Serge's advice below, here is the output of ulimit -a:

core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 2560
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 709
virtual memory          (kbytes, -v) unlimited

My MongoDB is running on OS X Yosemite.

Recommended answer

This is because you are using PyMongo incorrectly. You are creating a new MongoClient for each query, which opens a new socket for each query. This defeats PyMongo's connection pooling, and besides being extremely slow, it means you open and close sockets faster than your TCP stack can keep up: you leave too many sockets in the TIME_WAIT state, so you eventually run out of ports.
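A rough back-of-the-envelope calculation shows how quickly the ports can run out. The sketch below is not from the original answer. It assumes what I believe are the usual OS X defaults: an ephemeral port range of 49152 to 65535 and a TIME_WAIT period of about 30 seconds. Check net.inet.ip.portrange.first, net.inet.ip.portrange.last and net.inet.tcp.msl with sysctl on your own machine. It also assumes one socket per MongoClient, which may be an undercount.

```python
def max_new_sockets_per_second(first_port=49152, last_port=65535,
                               time_wait_seconds=30.0):
    """Rough upper bound on new connections per second to one server.

    The defaults are assumed OS X values, not figures from the question.
    Each closed socket holds its local port for time_wait_seconds, so the
    sustainable rate is (number of ephemeral ports) / (TIME_WAIT duration).
    """
    ephemeral_ports = last_port - first_port + 1
    return ephemeral_ports / time_wait_seconds


limit = max_new_sockets_per_second()
print("sustainable rate: about %.0f new sockets per second" % limit)

# Unindexed case from the question: 0.5 s per query, one new client each.
unindexed_rate = 1 / 0.5
print("unindexed rate stays under the limit: %s" % (unindexed_rate < limit))
```

Under these assumptions, opening a few hundred new clients per second is enough to exhaust the range. That may also explain why the error disappears without the index: at 0.5 seconds per query, new sockets are created far too slowly to pile up in TIME_WAIT.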

Luckily, the fix is simple. Create one MongoClient and use it throughout:

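import pymongo

# Create the client once so every query shares its connection pool.
conn = pymongo.MongoClient('mongodb://localhost:27017')
coll = conn.databases['race_results']

def _perform_queries(query):
    # Reuses the shared client and its pooled sockets.
    return coll.find(query).sort("date", -1)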
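As a usage sketch, the loop from the question could share a single client like this. The helper names build_query and races_for_runners, the (name, jockey) tuples and the placeholder values in main are illustrative and not from the original post. The helpers accept any object with a PyMongo-style find(...).sort(...) interface, so the caller creates the connection exactly once.

```python
def build_query(name, jockey):
    """Build the query dictionary from the question for one runner."""
    return {"$and": [
        {"opponents": {"$elemMatch": {"$and": [
            {"runner.name": name},
            {"runner.jockey": jockey},
        ]}}},
        {"summary.dist": "1"},
    ]}


def races_for_runners(coll, runners):
    """Yield (name, jockey, races) using one shared collection handle."""
    for name, jockey in runners:
        cursor = coll.find(build_query(name, jockey)).sort("date", -1)
        yield name, jockey, list(cursor)


def main():
    # Imported here so the helpers above can be used without PyMongo.
    import pymongo

    client = pymongo.MongoClient("mongodb://localhost:27017")
    try:
        coll = client.databases["race_results"]
        runners = [("Some Runner", "Some Jockey")]  # placeholder values
        for name, jockey, races in races_for_runners(coll, runners):
            print(name, jockey, len(races))
    finally:
        # Close once, after all queries have finished.
        client.close()
```

Calling main() needs a MongoDB server on localhost. The point of the structure is that the client is opened and closed once around the whole loop, not once per query.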
