Return only matched sub-document elements within a nested array


Question

The main collection is retailers, and each document contains an array of stores. Each store contains an array of offers (items you can buy in that store), and each offer has an array of sizes (see the example below).

Now I am trying to find all offers that are available in size L.

{
    "_id" : ObjectId("56f277b1279871c20b8b4567"),
    "stores" : [
        {
            "_id" : ObjectId("56f277b5279871c20b8b4783"),
            "offers" : [
                {
                    "_id" : ObjectId("56f277b1279871c20b8b4567"),
                    "size": [
                        "XS",
                        "S",
                        "M"
                    ]
                },
                {
                    "_id" : ObjectId("56f277b1279871c20b8b4567"),
                    "size": [
                        "S",
                        "L",
                        "XL"
                    ]
                }
            ]
        }
    ]
}

I have tried this query:

db.getCollection('retailers').find({'stores.offers.size': 'L'})

I expect output like this:

{
    "_id" : ObjectId("56f277b1279871c20b8b4567"),
    "stores" : [
        {
            "_id" : ObjectId("56f277b5279871c20b8b4783"),
            "offers" : [
                {
                    "_id" : ObjectId("56f277b1279871c20b8b4567"),
                    "size": [
                        "S",
                        "L",
                        "XL"
                    ]
                }
            ]
        }
    ]
}

But the output of my query also contains the non-matching offer with sizes XS, S and M.

How can I force MongoDB to return only the offers that match my query?

Greetings and thanks.

Recommended answer

So the query you have actually selects the "document" just as it should. What you are looking for is to "filter the arrays" it contains, so that the returned elements are only those matching the query condition.

The real answer, of course, is that unless filtering out this detail really saves a lot of bandwidth, you should not even try, or at least not go beyond the first positional match.

MongoDB has a positional $ operator, which returns the array element at the index matched by the query condition. However, it only returns the "first" matched index of the "outermost" array.

db.getCollection('retailers').find(
    { 'stores.offers.size': 'L'},
    { 'stores.$': 1 }
)

In this case, that means the "stores" array position only. So if there were multiple "stores" entries, only "one" of the elements containing your matched condition would be returned. But that does nothing for the inner "offers" array, so every "offer" within the matched "stores" element would still be returned.
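
To make the limitation concrete, here is a minimal sketch in plain JavaScript (not MongoDB code) that models what the positional projection hands back. The sample document, its short string ids, and the helper name projectFirstMatchingStore are all hypothetical, chosen only for illustration.

```javascript
// Plain-JavaScript model of the { 'stores.$': 1 } projection; not MongoDB code.
// The document and the short string ids are made up for illustration.
const retailer = {
  _id: "r1",
  stores: [
    { _id: "s1", offers: [{ _id: "o1", size: ["XS", "S", "M"] }] },
    { _id: "s2", offers: [
      { _id: "o2", size: ["XS", "S", "M"] },
      { _id: "o3", size: ["S", "L", "XL"] }
    ] },
    { _id: "s3", offers: [{ _id: "o4", size: ["L"] }] }
  ]
};

// Hypothetical helper: keep only the first store holding an offer in `size`.
function projectFirstMatchingStore(doc, size) {
  const first = doc.stores.find(store =>
    store.offers.some(offer => offer.size.includes(size))
  );
  return { _id: doc._id, stores: first ? [first] : [] };
}

const projected = projectFirstMatchingStore(retailer, "L");
console.log(JSON.stringify(projected, null, 2));
```

In this model only store "s2" comes back, even though "s3" also matches, and "s2" still carries its non-matching offer "o2" because the inner array is returned untouched.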

MongoDB has no way of "filtering" this in a standard query, so the following does not work:

db.getCollection('retailers').find(
    { 'stores.offers.size': 'L'},
    { 'stores.$.offers.$': 1 }
)

The only tool MongoDB actually has for this level of manipulation is the aggregation framework. But the analysis should show you why you "probably" should not do this, and should instead just filter the array in code.

Here is how you can achieve this, in order by version.

First, with MongoDB 3.2.x, using the $filter operation:

db.getCollection('retailers').aggregate([
  { "$match": { "stores.offers.size": "L" } },
  { "$project": {
    "stores": {
      "$filter": {
        "input": {
          "$map": {
            "input": "$stores",
            "as": "store",
            "in": {
              "_id": "$$store._id",
              "offers": {
                "$filter": {
                  "input": "$$store.offers",
                  "as": "offer",
                  "cond": {
                    "$setIsSubset":  [ ["L"], "$$offer.size" ]
                  }
                }
              }
            }
          }
        },
        "as": "store",
        "cond": { "$ne": [ "$$store.offers", [] ]}
      }
    }
  }}
])

Then with MongoDB 2.6.x and above, using $map and $setDifference:

db.getCollection('retailers').aggregate([
  { "$match": { "stores.offers.size": "L" } },
  { "$project": {
    "stores": {
      "$setDifference": [
        { "$map": {
          "input": {
            "$map": {
              "input": "$stores",
              "as": "store",
              "in": {
                "_id": "$$store._id",
                "offers": {
                  "$setDifference": [
                    { "$map": {
                      "input": "$$store.offers",
                      "as": "offer",
                      "in": {
                        "$cond": {
                          "if": { "$setIsSubset": [ ["L"], "$$offer.size" ] },
                          "then": "$$offer",
                          "else": false
                        }
                      }
                    }},
                    [false]
                  ]
                }
              }
            }
          },
          "as": "store",
          "in": {
            "$cond": {
              "if": { "$ne": [ "$$store.offers", [] ] },
              "then": "$$store",
              "else": false
            }
          }
        }},
        [false]
      ]
    }
  }}
])

And finally, in any version from MongoDB 2.2.x (where the aggregation framework was introduced) onward:

db.getCollection('retailers').aggregate([
  { "$match": { "stores.offers.size": "L" } },
  { "$unwind": "$stores" },
  { "$unwind": "$stores.offers" },
  { "$match": { "stores.offers.size": "L" } },
  { "$group": {
    "_id": {
      "_id": "$_id",
      "storeId": "$stores._id",
    },
    "offers": { "$push": "$stores.offers" }
  }},
  { "$group": {
    "_id": "$_id._id",
    "stores": {
      "$push": {
        "_id": "$_id.storeId",
        "offers": "$offers"
      }
    }
  }}
])

Let's break down the explanation.

In MongoDB 3.2 and above, $filter is generally the way to go, since it was designed with this purpose in mind. Because the array has multiple levels, you need to apply it at each level. So first you dive into each "offers" array within "stores" to examine and $filter that content.

The simple comparison here is "Does the "size" array contain the element I am looking for?". In this logical context, the shortest thing to do is use the $setIsSubset operation to compare an array ("set") of ["L"] to the target array. Where that condition is true (it contains "L"), the "offers" array element is retained and returned in the result.

In the higher-level $filter, you then check whether the previous $filter returned an empty array [] for "offers". If it is not empty, the store element is returned; otherwise it is removed.
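
The two $filter levels can be modelled in plain JavaScript to show the shape of the logic. This is only a sketch of the idea, not what the server executes; the function names setIsSubset and filterStores and the sample data are hypothetical.

```javascript
// Model of $setIsSubset: true when every element of `subset` is in `superset`.
function setIsSubset(subset, superset) {
  const lookup = new Set(superset);
  return subset.every(item => lookup.has(item));
}

// Inner filter keeps matching offers; outer filter drops stores left empty.
function filterStores(stores, size) {
  return stores
    .map(store => ({
      _id: store._id,
      offers: store.offers.filter(offer => setIsSubset([size], offer.size))
    }))
    .filter(store => store.offers.length > 0);
}

// Made-up sample data for illustration.
const sampleStores = [
  { _id: "s1", offers: [
    { _id: "o1", size: ["XS", "S", "M"] },
    { _id: "o2", size: ["S", "L", "XL"] }
  ] },
  { _id: "s2", offers: [{ _id: "o3", size: ["M"] }] }
];

const filteredStores = filterStores(sampleStores, "L");
console.log(JSON.stringify(filteredStores));
```

Store "s2" disappears entirely because its "offers" array is empty after the inner filter, which mirrors the "$ne": [ "$$store.offers", [] ] condition in the pipeline.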

The MongoDB 2.6 listing is very similar to the modern process, except that since there is no $filter in this version, you use $map to inspect each element and then $setDifference to filter out any elements that were returned as false.

So $map returns the whole array, but the $cond operation decides whether each position holds the element or a false value instead. Comparing that result with $setDifference against the single-element "set" [false] then removes all false elements from the returned array.
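
A small sketch in plain JavaScript of the $map plus $setDifference trick follows. The setDifference helper here is a hypothetical stand-in that compares elements by their JSON text; it also drops duplicates, since the real operator treats its inputs as sets.

```javascript
// Hypothetical stand-in for $setDifference: elements of `a` not present in `b`,
// compared by JSON text, with duplicates removed (set semantics).
function setDifference(a, b) {
  const exclude = new Set(b.map(item => JSON.stringify(item)));
  const seen = new Set();
  const out = [];
  for (const item of a) {
    const key = JSON.stringify(item);
    if (!exclude.has(key) && !seen.has(key)) {
      seen.add(key);
      out.push(item);
    }
  }
  return out;
}

// Made-up sample data for illustration.
const offers = [
  { _id: "o1", size: ["XS", "S", "M"] },
  { _id: "o2", size: ["S", "L", "XL"] }
];

// $map with $cond: keep the offer where it matches, otherwise emit false.
const mapped = offers.map(offer => (offer.size.includes("L") ? offer : false));

// $setDifference against [false] strips the placeholders.
const kept = setDifference(mapped, [false]);
console.log(JSON.stringify(kept));
```

Because the operator works on sets, this approach is only safe when the array elements are distinct, which holds here as long as each offer has its own _id.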

In all other respects, the logic is the same as above.

Below MongoDB 2.6, the only tool for working with arrays is $unwind, and you should not use the aggregation framework "just" for this purpose.

The process does look simple: "take apart" each array, filter out the things you don't need, then put it back together. The main care is in the "two" $group stages, with the "first" rebuilding the inner array and the next rebuilding the outer array. There are distinct _id values at all levels, so these just need to be included at every level of grouping.
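
The take-apart-and-rebuild flow can be sketched in plain JavaScript as follows. This models the stages only conceptually; the row shape (docId, storeId, offer) and the sample data are assumptions made for the example.

```javascript
// Made-up sample data for illustration.
const docs = [{
  _id: "r1",
  stores: [
    { _id: "s1", offers: [
      { _id: "o1", size: ["XS", "S", "M"] },
      { _id: "o2", size: ["S", "L", "XL"] }
    ] },
    { _id: "s2", offers: [{ _id: "o3", size: ["M"] }] }
  ]
}];

// Two $unwind stages: one row per (document, store, offer).
const rows = docs.flatMap(doc =>
  doc.stores.flatMap(store =>
    store.offers.map(offer => ({ docId: doc._id, storeId: store._id, offer }))
  )
);

// Second $match: keep only rows whose offer has size "L".
const matching = rows.filter(row => row.offer.size.includes("L"));

// First $group: rebuild the inner "offers" array per (docId, storeId).
const byStore = new Map();
for (const row of matching) {
  const key = JSON.stringify([row.docId, row.storeId]);
  if (!byStore.has(key)) {
    byStore.set(key, { docId: row.docId, _id: row.storeId, offers: [] });
  }
  byStore.get(key).offers.push(row.offer);
}

// Second $group: rebuild the outer "stores" array per docId.
const byDoc = new Map();
for (const { docId, _id, offers } of byStore.values()) {
  if (!byDoc.has(docId)) byDoc.set(docId, { _id: docId, stores: [] });
  byDoc.get(docId).stores.push({ _id, offers });
}

const rebuilt = [...byDoc.values()];
console.log(JSON.stringify(rebuilt));
```

Both grouping keys have to carry the parent _id, which is the same reason the pipeline groups on the compound { _id, storeId } key first.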

But the problem is that $unwind is very costly. Though it still has a purpose, its main intended use is not this sort of per-document filtering. In fact, in modern releases its only use should be when an element of the array(s) needs to become part of the "grouping key" itself.
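
One way to see where the cost comes from is to count the intermediate rows. The sketch below is plain JavaScript with an invented document shape (100 stores of 50 offers each); the numbers illustrate the row multiplication only and are not a benchmark.

```javascript
// Hypothetical shape: one retailer with 100 stores, each holding 50 offers.
// Only the first offer in each store is available in size "L".
const bigRetailer = {
  _id: "r1",
  stores: Array.from({ length: 100 }, (_, s) => ({
    _id: `s${s}`,
    offers: Array.from({ length: 50 }, (_, o) => ({
      _id: `s${s}-o${o}`,
      size: o === 0 ? ["L"] : ["M"]
    }))
  }))
};

// Model of the two $unwind stages: one intermediate row per offer.
const unwoundRows = bigRetailer.stores.flatMap(store =>
  store.offers.map(offer => ({ _id: bigRetailer._id, storeId: store._id, offer }))
);

// Rows that survive the second $match and still have to be regrouped.
const survivingRows = unwoundRows.filter(row => row.offer.size.includes("L"));

console.log(unwoundRows.length);   // 5000
console.log(survivingRows.length); // 100
```

A single input document turns into one row per offer before anything is filtered, whereas the $filter and $map forms work on the document in place.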

So getting matches at multiple levels of an array like this is not a simple process, and it can be extremely costly if implemented incorrectly.

Only the first two listings ($filter, and $map with $setDifference) should ever be used for this purpose, as they need just a "single" pipeline stage in addition to the "query" $match to do the "filtering". The result is little more overhead than the standard forms of .find().

In general though, those listings still carry a fair amount of complexity. Unless this filtering drastically reduces the content returned, in a way that significantly improves the bandwidth used between server and client, you are better off filtering the result of the initial query and basic projection in code.

db.getCollection('retailers').find(
    { 'stores.offers.size': 'L'},
    { 'stores.$': 1 }
).forEach(function(doc) {
    // Technically this is only "one" store. So omit the projection
    // if you wanted more than "one" match
    doc.stores = doc.stores.filter(function(store) {
        store.offers = store.offers.filter(function(offer) {
            return offer.size.indexOf("L") != -1;
        });
        return store.offers.length != 0;
    });
    printjson(doc);
})

So working with the returned object in "post" query processing is far less obtuse than using the aggregation pipeline to do this. As stated, the only "real" difference is that the pipeline discards the other elements on the "server" rather than removing them "per document" when received, which may save a little bandwidth.

But unless you are doing this in a modern release with only $match and $project, the "cost" of processing on the server will greatly outweigh the "gain" of reducing network overhead by stripping the unmatched elements first.

In all cases, you get the same result:

{
        "_id" : ObjectId("56f277b1279871c20b8b4567"),
        "stores" : [
                {
                        "_id" : ObjectId("56f277b5279871c20b8b4783"),
                        "offers" : [
                                {
                                        "_id" : ObjectId("56f277b1279871c20b8b4567"),
                                        "size" : [
                                                "S",
                                                "L",
                                                "XL"
                                        ]
                                }
                        ]
                }
        ]
}
