按组连接字符串 [英] Concat String by Group

查看:19
本文介绍了按组连接字符串的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想按 _id 对记录进行分组,并通过组合 client_id 值创建一个字符串.

I want to group records by _id and create a string by combining client_id values.

以下是我的文档示例:

{
  "_id" : ObjectId("59e955e633d64c81875bfd2f"),
  "tag_id" : 1,
  "client_id" : "10001"
}
{
  "_id" : ObjectId("59e955e633d64c81875bfd30"),
  "tag_id" : 1,
  "client_id" : "10002"
}

我想要这个输出:

{
  "_id" : 1
  "client_id" : "10001,10002"
}

推荐答案

您可以将聚合框架作为两步"操作来完成.首先通过 $push<将项目累积到一个数组中/code> 带有 $group 管道,然后使用 $concat$reduce 在最终投影中生成的数组上:

You can do it with the aggregation framework as a "two step" operation. Which is to first accumulate the items to an array via $push withing a $group pipeline, and then to use $concat with $reduce on the produced array in final projection:

db.collection.aggregate([
  { "$group": {
    "_id": "$tag_id",
    "client_id": { "$push": "$client_id" }
  }},
  { "$addFields": {
    "client_id": {
      "$reduce": {
        "input": "$client_id",
        "initialValue": "",
        "in": {
          "$cond": {
            "if": { "$eq": [ "$$value", "" ] },
            "then": "$$this",
            "else": {
              "$concat": ["$$value", ",", "$$this"]
            }
          }
        }
      }
    }
  }}
])

我们还应用 $cond 这里是为了避免在结果中用逗号连接空字符串,所以看起来更像是一个分隔列表.

We also apply $cond here to avoid concatenating an empty string with a comma in the results, so it looks more like a delimited list.

仅供参考,有一个 JIRA 问题 SERVER-29339 确实要求 $reduce 被实现为 累加器表达式 允许直接在 $group 管道阶段.不太可能很快发生,但理论上它将取代 $push 使操作成为单个流水线阶段.建议的语法示例在 JIRA 问题上.

FYI There is an JIRA issue SERVER-29339 which does ask for $reduce to be implemented as an accumulator expression to allow it's use directly in a $group pipeline stage. Not likely to happen any time soon, but it theoretically would replace $push in the above and make the operation a single pipeline stage. Sample proposed syntax is on the JIRA issue.

如果你没有 $reduce(需要 MongoDB 3.4)然后只需对光标进行后处理:

If you don't have $reduce ( requires MongoDB 3.4 ) then just post process the cursor:

db.collection.aggregate([
  { "$group": {
    "_id": "$tag_id",
    "client_id": { "$push": "$client_id" }
  }},
]).map( doc =>
  Object.assign(
    doc,
   { "client_id": doc.client_id.join(",") }
  )
)

然后导致使用 mapReduce 如果你真的需要:

Which then leads to the other alternative of doing this using mapReduce if you really must:

db.collection.mapReduce(
  function() {
    emit(this.tag_id,this.client_id);
  },
  function(key,values) {
    return [].concat.apply([],values.map(v => v.split(","))).join(",");
  },
  { "out": { "inline": 1 } }
)

当然是以_idvalue的具体mapReduce形式作为key的集合输出的,但基本都是输出.

Which of course outputs in the specific mapReduce form of _id and value as the set of keys, but it is basically the output.

我们使用 [].concat.apply([],values.map(...)) 因为reducer"的输出可以是分隔字符串",因为 mapReduce 以增量方式处理大结果,因此reducer 的输出可以在另一遍中成为输入".所以我们需要预料到这种情况会发生并相应地对待它.

We use [].concat.apply([],values.map(...)) because the output of the "reducer" can be a "delimited string" because mapReduce works incrementally with large results and therefore output of the reducer can become "input" on another pass. So we need to expect that this can happen and treat it accordingly.

这篇关于按组连接字符串的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆