Avoid Aggregate 16MB Limit


Question

I have a collection of about 1M documents. Each document has an internalNumber property, and I need to get all internalNumber values in my node.js code.

Previously I was using

db.docs.distinct("internalNumber")

or

collection.distinct('internalNumber', {}, {}, (err, result) => { /* ... */ })

in Node.

But as the collection grew, I started to get the error: distinct is too big, 16m cap.

Now I want to use aggregation instead. It consumes a lot of memory and is slow, but that is acceptable because I only need to run it once at script startup. I tried the following in the Robo 3T GUI tool:

db.docs.aggregate([{$group: {_id: '$internalNumber'} }]); 

It works, so I wanted to use it in my node.js code in the following way:

collection.aggregate([{$group: {_id: '$internalNumber'} }],
  (err, docs) => { /* ... */ });

But in Node I get an error: "MongoError: aggregation result exceeds maximum document size (16MB) at Function.MongoError.create".

Please help me get past this limit.

Recommended answer

The problem is that the native driver does not behave the way the shell method does by default: the shell actually returns a "cursor" object, whereas the native driver needs this option set explicitly.

Without a "cursor", .aggregate() returns a single BSON document containing an array of documents, and that single document is subject to the 16MB limit. So we turn the result into a cursor to avoid the limitation:

let cursor = collection.aggregate(
  [{ "$group": { "_id": "$internalNumber" } }],
  { "cursor": { "batchSize": 500 } }
);

cursor.toArray((err,docs) => {
   // work with results
});

Then you can use regular methods like .toArray() to turn the results into a JavaScript array, which on the "client" does not share the same limitation, or use other methods for iterating a "cursor".
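As an illustration of the "other methods for iterating" mentioned above, here is a minimal sketch that walks the cursor one document at a time instead of calling .toArray(). It assumes the cursor is async-iterable (usable with for await), which may not hold for older driver versions; the helper name collectInternalNumbers is hypothetical and not part of the driver.

```javascript
// Hypothetical helper: collect the grouped values from an aggregation cursor.
// Assumption: `cursor` supports async iteration (for await ... of).
async function collectInternalNumbers(cursor) {
  const numbers = [];
  for await (const doc of cursor) {
    // Each $group result has the shape { _id: <internalNumber> }.
    numbers.push(doc._id);
  }
  return numbers;
}

// Possible usage with the pipeline from the answer (not executed here):
// const cursor = collection.aggregate(
//   [{ $group: { _id: '$internalNumber' } }],
//   { cursor: { batchSize: 500 } }
// );
// const numbers = await collectInternalNumbers(cursor);
```

Iterating this way pulls documents from the server in batches, so no single response has to hold the whole result, although the final array still has to fit in the Node process's memory.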
