Need a distinct count on multiple fields that were joined from another collection using a MongoDB aggregation query


Question

I'm trying to use a MongoDB aggregation query to join ($lookup) two collections and then get a distinct count of the unique values in the joined array.

My two collections look like this. events:

{
    "_id" : "1",
    "name" : "event1",
    "objectsIds" : [ "1", "2", "3" ],
}

objects:

{
    "_id" : "1",
    "name" : "object1",
    "metaDataMap" : { 
                         "SOURCE" : ["ABC", "DEF"],
                         "DESTINATION" : ["XYZ", "PDQ"],
                         "TYPE" : []
                    }
},
{
    "_id" : "2",
    "name" : "object2",
    "metaDataMap" : { 
                         "SOURCE" : ["RST", "LNE"],
                         "TYPE" : ["text"]
                    }
},
{
    "_id" : "3",
    "name" : "object3",
    "metaDataMap" : { 
                         "SOURCE" : ["NOP"],
                         "DESTINATION" : ["PHI", "NYC"],
                         "TYPE" : ["video"]
                    }
}

What I want is this: when I $match on event _id=1, I want to join in the metaDataMap of each referenced object and then get a distinct count of the values under each key, like this. Counts for event _id=1:

SOURCE : 5
DESTINATION: 4
TYPE: 2

What I have so far is this:

db.events.aggregate([
 {$match: {"_id" : id}}
,{$lookup: {"from" : "objects",
            "localField" : "objectsIds",
            "foreignField" : "_id",
            "as" : "objectResults"}}
,{$project: {x: {$objectToArray: "$objectResults.metaDataMap"}}}
,{$unwind: "$x"}
,{$match: {"x.k": {$ne: "_id"}}}
,{$group: {_id: "$x.k", y: {$addToSet: "$x.v"}}}
,{$addFields: {size: {"$size":"$y"}} }
]);

This fails because $objectResults.metaDataMap is not an object; it is an array. Any suggestions on how to solve this, or a different way to do what I want to do? Also, I don't necessarily know which fields (keys) are in the metaDataMap, and I don't want to count or include fields that might or might not exist in the map.
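To make the failure concrete, here is a minimal sketch in plain JavaScript (no MongoDB involved) of what the path $objectResults.metaDataMap resolves to after the $lookup. The variable names resolved and perObjectPairs are made up for illustration, and Object.entries only approximates what $objectToArray does.

```javascript
// Stand-in for the array that $lookup stores in objectResults.
const objectResults = [
  { _id: "1", metaDataMap: { SOURCE: ["ABC", "DEF"], DESTINATION: ["XYZ", "PDQ"], TYPE: [] } },
  { _id: "2", metaDataMap: { SOURCE: ["RST", "LNE"], TYPE: ["text"] } },
  { _id: "3", metaDataMap: { SOURCE: ["NOP"], DESTINATION: ["PHI", "NYC"], TYPE: ["video"] } },
];

// A field path through an array yields one value per element,
// so the result is an array of maps rather than a single map.
const resolved = objectResults.map((o) => o.metaDataMap);
const isArray = Array.isArray(resolved);

// Converting each element on its own (what unwinding first makes possible)
// gives a list of { k, v } pairs per joined object.
const perObjectPairs = resolved.map((m) =>
  Object.entries(m).map(([k, v]) => ({ k, v }))
);

console.log(isArray, perObjectPairs[1]);
```

So the array has to be unwound (or otherwise iterated) before the key/value conversion is applied to each map.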

Recommended answer

This should do the trick. I tested it on your input set and deliberately added some duplicate values, such as NYC showing up in more than one DESTINATION, to ensure they get de-duplicated (i.e. a distinct count, as asked for). For fun, comment out all the stages, then uncomment them from the top down to see the effect of each stage of the pipeline.

var id = "1";

// Note: foo and foo2 stand in for the events and objects collections.
c=db.foo.aggregate([
// Find a thing:
{$match: {"_id" : id}}

// Do the lookup into the objects collection:
,{$lookup: {"from" : "foo2",
            "localField" : "objectsIds",
            "foreignField" : "_id",
            "as" : "objectResults"}}

// OK, so we've got a bunch of extra material now.  Let's
// get down to just the metaDataMap:
,{$project: {x: "$objectResults.metaDataMap"}}
,{$unwind: "$x"}
,{$project: {"_id":0}}

// Use $objectToArray to get all the field names dynamically:
// Replace the old x with new x (don't need the old one):
,{$project: {x: {$objectToArray: "$x"}}}
,{$unwind: "$x"}

// Collect unique field names.  Interesting note: the values
// here are ARRAYS, not scalars, so $push is creating an
// array of arrays:
,{$group: {_id: "$x.k", tmp: {$push: "$x.v"}}}

// Almost there!  We have to turn the array of array (of string)
// into a single array which we'll subsequently dedupe.  We will
// overwrite the old tmp with a new one, too:
,{$addFields: {tmp: {$reduce:{
    input: "$tmp",
    initialValue:[],
    in:{$concatArrays: [ "$$value", "$$this"]}
        }}
    }}

// Now just unwind and regroup using the addToSet operator
// to dedupe the list:
,{$unwind: "$tmp"}
,{$group: {_id: "$_id", uniqueVals: {$addToSet: "$tmp"}}}

// Add size for good measure:
,{$addFields: {size: {"$size":"$uniqueVals"}} }
          ]);
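As a way to sanity-check the expected numbers without a database, the same logic can be sketched in plain JavaScript. This is only an in-memory approximation of the pipeline above, not MongoDB code; the function name distinctCounts and the variable names are hypothetical.

```javascript
// Sample documents copied from the question.
const event = { _id: "1", name: "event1", objectsIds: ["1", "2", "3"] };
const objects = [
  { _id: "1", name: "object1", metaDataMap: { SOURCE: ["ABC", "DEF"], DESTINATION: ["XYZ", "PDQ"], TYPE: [] } },
  { _id: "2", name: "object2", metaDataMap: { SOURCE: ["RST", "LNE"], TYPE: ["text"] } },
  { _id: "3", name: "object3", metaDataMap: { SOURCE: ["NOP"], DESTINATION: ["PHI", "NYC"], TYPE: ["video"] } },
];

function distinctCounts(evt, objs) {
  // Rough equivalent of $lookup: keep objects referenced by the event.
  const joined = objs.filter((o) => evt.objectsIds.includes(o._id));
  const sets = new Map();
  for (const obj of joined) {
    // Keys are discovered dynamically, as $objectToArray does.
    for (const [key, values] of Object.entries(obj.metaDataMap || {})) {
      for (const v of values) {
        // A key is created only once it has a value, so keys whose arrays
        // are all empty do not appear (the final $unwind drops them too).
        if (!sets.has(key)) sets.set(key, new Set());
        sets.get(key).add(v); // the Set de-duplicates, like $addToSet
      }
    }
  }
  const counts = {};
  for (const [key, set] of sets) counts[key] = set.size;
  return counts;
}

const counts = distinctCounts(event, objects);
console.log(counts);
```

On the question's sample data this yields SOURCE 5, DESTINATION 4 and TYPE 2, matching the counts asked for.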
