Multiple limit condition in MongoDB

Question

I have a collection in which one of the fields is "type". I want to fetch a fixed number of documents for each type, using a condition that is the same for all types: for example, 2 documents of type A and 2 of type B. How can I do this in a single query? I am using Ruby Active Record.

Recommended answer

What you are describing is a relatively common question in the MongoDB community, which we could call the "top n results problem": given input that is probably sorted in some way, how do you get the top n results without relying on arbitrary index values in the data?
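To pin down what "top n results" means for the question above, here is a minimal in-memory sketch in plain Ruby. It is only a reference for the expected outcome, not the server-side solution; the `rows` data and the `top_n_per_group` helper are hypothetical names introduced for illustration.

```ruby
# Hypothetical in-memory rows standing in for documents in the collection.
rows = [
  { type: "A", pos: "First" },
  { type: "A", pos: "Second" },
  { type: "A", pos: "Third" },
  { type: "B", pos: "First" },
  { type: "B", pos: "Second" },
  { type: "B", pos: "Third" }
]

# Keep the input order and take the first n rows of each group.
def top_n_per_group(rows, key, n)
  rows.group_by { |row| row[key] }.flat_map { |_, group| group.first(n) }
end

top_two = top_n_per_group(rows, :type, 2)
top_two.each { |row| puts row.inspect }
```

The rest of the answer is about getting the same outcome from the server in a single aggregation, rather than pulling every document into the application first.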

MongoDB has the $first operator, available in the aggregation framework, which handles the "top 1" part of the problem: it takes the first item found on a grouping boundary, such as your "type". Getting more than one result is a little more involved. There are JIRA issues about modifying other operators to deal with n results, or to "restrict" or "slice", notably SERVER-6074. But the problem can be handled in a few ways.

Popular implementations of the Rails Active Record pattern for MongoDB storage are Mongoid and Mongo Mapper. Both allow access to the "native" MongoDB collection functions via a .collection accessor. This is what you need in order to use native methods such as .aggregate(), which supports more functionality than general Active Record aggregation.

Here is an aggregation approach with Mongoid, though the general code does not change once you have access to the native collection object:

require "mongoid"
require "pp"

Mongoid.configure.connect_to("test")

class Item
  include Mongoid::Document
  store_in collection: "item"

  field :type, type: String
  field :pos, type: String
end

Item.collection.drop

Item.collection.insert( :type => "A", :pos => "First" )
Item.collection.insert( :type => "A", :pos => "Second"  )
Item.collection.insert( :type => "A", :pos => "Third" )
Item.collection.insert( :type => "A", :pos => "Fourth" )
Item.collection.insert( :type => "B", :pos => "First" )
Item.collection.insert( :type => "B", :pos => "Second" )
Item.collection.insert( :type => "B", :pos => "Third" )
Item.collection.insert( :type => "B", :pos => "Fourth" )

res = Item.collection.aggregate([
  # Stack every document per type and keep the first one as "one".
  { "$group" => {
      "_id" => "$type",
      "docs" => {
        "$push" => {
          "pos" => "$pos", "type" => "$type"
        }
      },
      "one" => {
        "$first" => {
          "pos" => "$pos", "type" => "$type"
        }
      }
  }},
  # Unwind the stack; the next stage flags entries equal to "one" as seen.
  { "$unwind" =>  "$docs" },
  { "$project" => {
    "docs" => {
      "pos" => "$docs.pos",
      "type" => "$docs.type",
      "seen" => {
        "$eq" => [ "$one", "$docs" ]
      },
    },
    "one" => 1
  }},
  # Discard the document already captured as "one".
  { "$match" => {
    "docs.seen" => false
  }},
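  # Regroup: the first remaining document becomes "two";
  # "splitter" exists only to drive the final unwind.
  { "$group" => {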
    "_id" => "$_id",
    "one" => { "$first" => "$one" },
    "two" => {
      "$first" => {
        "pos" => "$docs.pos",
        "type" => "$docs.type"
      }
    },
    "splitter" => {
      "$first" => {
        "$literal" => ["one","two"]
      }
    }
  }},
  # One row per splitter value; the projection then picks "one" or "two".
  { "$unwind" => "$splitter" },
  { "$project" => {
    "_id" => 0,
    "type" => {
      "$cond" => [
        { "$eq" => [ "$splitter", "one" ] },
        "$one.type",
        "$two.type"
      ]
    },
    "pos" => {
      "$cond" => [
        { "$eq" => [ "$splitter", "one" ] },
        "$one.pos",
        "$two.pos"
      ]
    }
  }}
])

pp res

The names in the documents are not used by the code. The "First", "Second", etc. values in the data are only there to show that you are indeed getting the "top 2" documents from the listing as a result.

So the approach here is essentially to create a "stack" of the documents grouped by your key, such as "type". The very first step is to take the first document from that stack using the $first operator.

The subsequent steps mark the "seen" elements in the stack and filter them out, then take the next document off the stack, again using the $first operator. The final steps are really just there to return the documents to the original form found in the input, which is generally what is expected from such a query.

So the result is, of course, just the top 2 documents for each type:

{ "type"=>"A", "pos"=>"First" }
{ "type"=>"A", "pos"=>"Second" }
{ "type"=>"B", "pos"=>"First" }
{ "type"=>"B", "pos"=>"Second" }
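The stack steps described above can be mimicked in plain Ruby to make the data flow easier to follow. This is only an illustrative sketch of the logic, not something you would run against the server; the `rows` array and the `groups` and `result` variables are assumptions introduced here.

```ruby
# Hypothetical in-memory copy of the sample data.
rows = [
  { type: "A", pos: "First" },  { type: "A", pos: "Second" },
  { type: "A", pos: "Third" },  { type: "A", pos: "Fourth" },
  { type: "B", pos: "First" },  { type: "B", pos: "Second" },
  { type: "B", pos: "Third" },  { type: "B", pos: "Fourth" }
]

# First $group: stack the documents per type and keep the first as "one".
groups = rows.group_by { |row| row[:type] }.map do |type, stack|
  { id: type, docs: stack, one: stack.first }
end

# $unwind / $project / $match: drop entries equal to "one".
# Second $group: the first remaining entry becomes "two".
groups.each do |group|
  remaining = group[:docs].reject { |doc| doc == group[:one] }
  group[:two] = remaining.first
end

# Splitter $unwind and final $project: emit "one" and "two" as plain rows.
result = groups.flat_map { |group| [group[:one], group[:two]] }
result.each { |row| puts row.inspect }
```

One difference worth noting: in the pipeline, a type with only a single document loses that document at the `$match` stage and so disappears from the output, whereas this sketch would emit a `nil` for "two". The sketch assumes every type has at least two distinct documents.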

There is a longer discussion and version of this, as well as other solutions, in this answer:

MongoDB aggregation $group, restrict length of array

It is essentially the same thing despite the title; that case was looking to match up to 10 top entries or more. There is also some pipeline generation code there for dealing with larger matches, as well as some alternative approaches that may be worth considering depending on your data.
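As one illustration of an alternative (not necessarily one covered in the linked answer), servers that provide the `$slice` aggregation expression (MongoDB 3.2 and later) can trim each group's array directly. The sketch below only builds the pipeline as plain Ruby data; the `top_n_pipeline` helper is a hypothetical name, and the pipeline relies on input order just as the example above does, so a `$sort` stage would be needed in front of it if "top" has a specific meaning.

```ruby
# Build (but do not run) a pipeline that keeps the first n documents per group.
# Assumption: the server supports the $slice aggregation expression.
def top_n_pipeline(group_field, n)
  [
    # Collect whole documents per group, in the order they arrive.
    { "$group" => {
        "_id"  => "$#{group_field}",
        "docs" => { "$push" => "$$ROOT" }
    }},
    # Keep only the first n entries of each group's array.
    { "$project" => { "docs" => { "$slice" => ["$docs", n] } } },
    # Turn the trimmed arrays back into one row per document.
    { "$unwind" => "$docs" }
  ]
end

pipeline = top_n_pipeline("type", 2)
pipeline.each { |stage| puts stage.inspect }
```

The resulting array could then be handed to the native collection, for example `Item.collection.aggregate(pipeline)`, with each returned row carrying the original document under `docs`.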
