Concat String by Group
Question
I want to group records by tag_id (so that it becomes the _id of each result) and create a string by combining the client_id values.
Here are some sample documents:
{
    "_id" : ObjectId("59e955e633d64c81875bfd2f"),
    "tag_id" : 1,
    "client_id" : "10001"
}
{
    "_id" : ObjectId("59e955e633d64c81875bfd30"),
    "tag_id" : 1,
    "client_id" : "10002"
}
I want this output:
{
    "_id" : 1,
    "client_id" : "10001,10002"
}
Recommended answer
You can do this with the aggregation framework as a "two step" operation: first accumulate the items into an array via $push within a $group pipeline stage, and then use $concat with $reduce on the produced array in a final projection:
db.collection.aggregate([
  { "$group": {
    "_id": "$tag_id",
    "client_id": { "$push": "$client_id" }
  }},
  { "$addFields": {
    "client_id": {
      "$reduce": {
        "input": "$client_id",
        "initialValue": "",
        "in": {
          "$cond": {
            "if": { "$eq": [ "$$value", "" ] },
            "then": "$$this",
            "else": {
              "$concat": ["$$value", ",", "$$this"]
            }
          }
        }
      }
    }
  }}
])
We also apply $cond here to avoid concatenating the empty initial string with a comma, which would leave a leading comma in the result. That way the output looks like a proper delimited list.
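To see what the $cond guard changes, here is a minimal sketch in plain JavaScript that mirrors the same fold outside of MongoDB. It is only an analogy to the pipeline expression, and the function names joinNaive and joinGuarded are made up for this illustration.

```javascript
// Plain JavaScript analogue of the $reduce expression above; not MongoDB API.
// joinNaive and joinGuarded are hypothetical names used only for illustration.

// Without the guard: the empty initial value is joined too,
// which leaves a leading comma in the result.
function joinNaive(values) {
  return values.reduce((value, current) => value + "," + current, "");
}

// With the guard (the $cond branch): the first element simply
// replaces the empty initial value, so no leading comma appears.
function joinGuarded(values) {
  return values.reduce(
    (value, current) => (value === "" ? current : value + "," + current),
    ""
  );
}

console.log(joinNaive(["10001", "10002"]));   // ",10001,10002"
console.log(joinGuarded(["10001", "10002"])); // "10001,10002"
```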
FYI, there is a JIRA issue, SERVER-29339, which asks for $reduce to be implemented as an accumulator expression so that it can be used directly in a $group pipeline stage. This is not likely to happen any time soon, but in theory it would replace $push in the above and make the operation a single pipeline stage. Sample proposed syntax is on the JIRA issue.
If you don't have $reduce (it requires MongoDB 3.4), then just post-process the cursor:
db.collection.aggregate([
  { "$group": {
    "_id": "$tag_id",
    "client_id": { "$push": "$client_id" }
  }},
]).map( doc =>
  Object.assign(
    doc,
    { "client_id": doc.client_id.join(",") }
  )
)
Which then leads to the other alternative of doing this with mapReduce, if you really must:
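db.collection.mapReduce(
  // map: emit each client_id under its tag_id
  function() {
    emit(this.tag_id, this.client_id);
  },
  // reduce: split every value first, since a value may be an earlier reduce output
  function(key, values) {
    return [].concat.apply([], values.map(v => v.split(","))).join(",");
  },
  // return the results inline instead of writing them to a collection
  { "out": { "inline": 1 } }
)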
This of course outputs in the mapReduce-specific form, with _id and value as the set of keys, but it is basically the same output.
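If you need the same field names as the aggregation result, the inline results can be reshaped on the client. The following is a minimal sketch in plain JavaScript, assuming the results array of the inline response has already been captured. The sample data and the toClientIdDocs helper are hypothetical and not part of MongoDB.

```javascript
// Hypothetical sample shaped like the "results" array of an inline mapReduce response.
const results = [
  { _id: 1, value: "10001,10002" }
];

// toClientIdDocs is an illustrative helper: it renames "value" to "client_id"
// and keeps "_id" unchanged.
function toClientIdDocs(docs) {
  return docs.map(({ _id, value }) => ({ _id, client_id: value }));
}

console.log(toClientIdDocs(results));
```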
We use [].concat.apply([],values.map(...)) because the output of the reducer can itself be a delimited string: mapReduce works incrementally with large results, so the output of the reducer can become input on another pass. We need to expect that this can happen and treat it accordingly, which is why every value is split before the pieces are joined again.
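The re-reduce behavior can be checked outside of MongoDB. This is a rough sketch in plain JavaScript using the same reducer body as a standalone function; the third value "10003" is made up for the illustration, and the two calls only simulate what successive passes might look like.

```javascript
// Same reducer body as in the mapReduce call, as a standalone function.
function reducer(key, values) {
  // Split each value on commas, flatten the pieces into one array, then re-join.
  return [].concat.apply([], values.map(v => v.split(","))).join(",");
}

// Simulated first pass over part of the values for key 1.
const partial = reducer(1, ["10001", "10002"]);

// Simulated later pass: the earlier output comes back in as one of the inputs,
// next to a value that has not been reduced yet ("10003" is hypothetical).
const combined = reducer(1, [partial, "10003"]);

console.log(partial);  // "10001,10002"
console.log(combined); // "10001,10002,10003"
```

Because already-joined strings are split before being joined again, feeding a previous result back in gives the same list as reducing all the raw values at once.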