Can't get allowDiskUse:True to work with pymongo

Problem description

I'm running into the "aggregation result exceeds maximum document size (16MB)" error with a MongoDB aggregation using pymongo.

I was able to work around it at first using the limit() option. However, at some point I got this error:

"Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in."

OK, I'll use the {'allowDiskUse': True} option. This option works when I use it on the command line, but when I tried to use it in my Python code

result = work1.aggregate(pipe, 'allowDiskUse:true')

I get a "TypeError: aggregate() takes exactly 2 arguments (3 given)" error, in spite of the definition given at http://api.mongodb.org/python/current/api/pymongo/collection.html#pymongo.collection.Collection.aggregate: aggregate(pipeline, **kwargs).

I tried to use runCommand, or rather its pymongo equivalent:

db.command('aggregate','work1',pipe, {'allowDiskUse':True})

but now I'm back to the "aggregation result exceeds maximum document size (16MB)" error.

In case you need to know, here is the pipeline:

pipe = [{'$project': {'_id': 0, 'summary.trigrams': 1}}, {'$unwind': '$summary'}, {'$unwind': '$summary.trigrams'}, {'$group': {'count': {'$sum': 1}, '_id': '$summary.trigrams'}}, {'$sort': {'count': -1}}, {'$limit': 10000}]

Thank you

Solution

So, in order:

  • aggregate is a method. It takes 2 positional arguments (self, which is implicitly passed, and pipeline) and any number of keyword arguments (which must be passed as foo=bar -- if there's no = sign, it's not a keyword argument). This means you need to call result = work1.aggregate(pipe, allowDiskUse=True).

  • Your error about maximum document size is inherent to Mongo. A single BSON document can never be larger than 16 megabytes, and when aggregate returns its results inline, the whole result array counts as one document. I can't tell you exactly why you hit the limit without seeing your data, but it probably means that the document you're building as an end result is too large. Try decreasing the $limit parameter, maybe? Start by setting it to 1, run a test, then increase it and look at how big the result gets when you do that.
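The difference between the two call styles in the first point can be seen without a database. The sketch below is only an illustration: FakeCollection is a hypothetical stand-in that copies the aggregate(pipeline, **kwargs) shape quoted in the question, not pymongo's real class, and the pipeline is shortened. With a real collection the call has the same form, work1.aggregate(pipe, allowDiskUse=True).

```python
class FakeCollection:
    """Hypothetical stand-in with the same call shape as Collection.aggregate."""

    def aggregate(self, pipeline, **kwargs):
        # Echo back what was received so each call style can be inspected.
        return {"pipeline": pipeline, "options": kwargs}


# Shortened version of the pipeline from the question (assumption: the
# exact stages do not matter for showing how options are passed).
pipe = [
    {"$group": {"_id": "$summary.trigrams", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
    {"$limit": 10000},
]

work1 = FakeCollection()

# Correct: the option is a keyword argument (name=value).
result = work1.aggregate(pipe, allowDiskUse=True)
print(result["options"])

# Also correct: unpack a dict of options into keyword arguments.
options = {"allowDiskUse": True}
same_result = work1.aggregate(pipe, **options)

# Incorrect: a string is just a second positional argument, so Python
# raises TypeError before anything would be sent to the server.
try:
    work1.aggregate(pipe, "allowDiskUse:true")
except TypeError as exc:
    print("TypeError:", exc)
```

The exact wording of the TypeError message depends on the Python version, so it may not match the text quoted in the question.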
