MongoDB: _id Cannot Be An Array


Question

I have a large dataset (about 1.1M documents) that I need to run mapReduce on.

The field to group on is an array named xref. Because of the size of the collection, and because I'm doing this in a 32-bit environment, I'm trying to reduce the collection into another collection in a new database.

First, here's a data sample:

{ "_id" : ObjectId("4ec6d3aa61910ad451f12e01"),
  "bii" : -32.9867,
  "class" : 2456,
  "decdeg" : -82.4856,
  "lii" : 297.4896,
  "name" : "HD 22237",
  "radeg" : 50.3284,
  "vmag" : 8,
  "xref" : ["HD 22237", "CPD -82   65", "-82   64","PPM 376283", "SAO 258336",
          "CP-82   65","GC 4125" ] }

{ "_id" : ObjectId("4ec6d44661910ad451f78eba"), 
  "bii" : -32.9901, 
  "class" : 2450, 
  "decdeg" : -82.4781, 
  "decpm" : 0.013,
  "lii" : 297.4807, 
  "name" : "PPM 376283", 
  "radeg" : 50.3543, 
  "rapm" : 0.0357, 
  "vmag" : 8.4, 
  "xref" : ["HD 22237", "CPD -82   65", "-82   64","PPM 376283", "SAO 258336",
          "CP-82   65","GC 4125" ] }

{ "_id" : ObjectId("4ec6d48a61910ad451feae04"),
  "bii" : -32.9903,
  "class" : 2450,
  "decdeg" : -82.4779,
  "decpm" : 0.027,
  "hd_component" : 0,
  "lii" : 297.4806,
  "name" : "SAO 258336",
  "radeg" : 50.3543,
  "rapm" : 0.0355,
  "vmag" : 8,
  "xref" : ["HD 22237", "CPD -82   65", "-82   64","PPM 376283", "SAO 258336",
          "CP-82   65","GC 4125" ] }

Here are the map and reduce functions (right now I'm only using the lii and bii fields):

function map() {
    try {
        emit(this.xref, {lii:this.lii, bii:this.bii});
    } catch(e) {
    }
}

function reduce(key, values) {
    var result = {xref:key, lii: 0.0, bii: 0.0};
    try {
        values.forEach(function(value) {
            if (value.lii && value.bii) {
                result.lii += value.lii;
                result.bii += value.bii;
            }
        });

        result.bii /= values.length;
        result.lii /= values.length;
    } catch(e) {
    }

    return result;
}

Unfortunately, running this eventually fails with an error message:

db.catalog.mapReduce(map, reduce, {out:{replace:"catalog2", db:"astro2"}});

Wed Nov 23 10:12:25 uncaught exception: map reduce failed:{
    "assertion" : "_id cannot be an array",
    "assertionCode" : 10099,
    "errmsg" : "db assertion failure",
    "ok" : 0

The xref field IS an array, but the array holds the same values in every document being grouped. Is it trying to use that array as the _id field in the new collection?

Recommended answer

Yes. An _id cannot be an array, because arrays get special treatment when indexed. The key you emit is used as the _id in the output collection. Emitting an array might work only with the "inline" output mode, and only if the result is small, since the result then won't go to a collection. Ideally, though, you would convert the array into a string (for example, by concatenating the values) and use that as the key, or make it a sub-object instead of an array.
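The two workarounds mentioned above could look roughly like the sketch below. It is only an illustration: the function names mapStringKey and mapSubObjectKey, the "|" separator, and the emit() stand-in are assumptions made so the code can run outside MongoDB, and neither variant has been checked against the asker's server version.

```javascript
// Stand-in for the emit() that MongoDB supplies to map functions,
// defined here only so the sketch can run outside the server.
var emitted = [];
function emit(key, value) {
    emitted.push({ key: key, value: value });
}

// Variant 1: join the array into one string key.
// The "|" separator is an assumption; use any string that cannot occur in an xref value.
function mapStringKey() {
    emit(this.xref.join("|"), { lii: this.lii, bii: this.bii });
}

// Variant 2: wrap the array in a sub-object, so the key itself is not an array.
function mapSubObjectKey() {
    emit({ xref: this.xref }, { lii: this.lii, bii: this.bii });
}

// Try both on a trimmed-down version of the first sample document.
var doc = { lii: 297.4896, bii: -32.9867, xref: ["HD 22237", "SAO 258336", "GC 4125"] };
mapStringKey.call(doc);
mapSubObjectKey.call(doc);
console.log(emitted[0].key);
console.log(JSON.stringify(emitted[1].key));
```

Both variants rely on xref having the same elements in the same order for every document that belongs to one group, as in the sample data. If the order could differ, sorting the array before building the key would be needed.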

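Also note that the result of your reduce function should not include the key. MongoDB may pass the output of reduce back into reduce for the same key, so the returned object should have the same shape as the values emitted by map. Just return {lii: .., bii: ..}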
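One way to keep the reduce output in the same shape as the emitted values is sketched below. It goes a step beyond the answer and should be read as an assumption, not as the answer's method: instead of averaging inside reduce, it carries running sums plus a count field (which map would have to emit as count: 1) and computes the averages in a separate finalize function, so that re-reducing partial results gives the same numbers.

```javascript
// Assumed value shape, both emitted by map and returned by reduce:
//   { lii: <sum of lii>, bii: <sum of bii>, count: <number of documents> }
function reduce(key, values) {
    var result = { lii: 0.0, bii: 0.0, count: 0 };
    values.forEach(function (value) {
        result.lii += value.lii;
        result.bii += value.bii;
        result.count += value.count;
    });
    return result; // same shape as the inputs, and no key inside
}

// Turns the final sums into averages, once per key.
function finalize(key, reduced) {
    return { lii: reduced.lii / reduced.count, bii: reduced.bii / reduced.count };
}

// Reducing everything at once and reducing in two passes should agree.
var values = [
    { lii: 1, bii: 2, count: 1 },
    { lii: 3, bii: 4, count: 1 },
    { lii: 5, bii: 6, count: 1 }
];
var onePass = reduce("k", values);
var twoPass = reduce("k", [reduce("k", values.slice(0, 2)), values[2]]);
console.log(JSON.stringify(finalize("k", onePass)));
console.log(JSON.stringify(finalize("k", twoPass)));
```

On the server, finalize would be passed through the mapReduce options object next to out, for example {out: {replace: "catalog2", db: "astro2"}, finalize: finalize}.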
