PyMongo aggregate: filter by number of fields (dynamic)

Question

Let's say I have an aggregation pipeline that currently produces documents built like this:

{'name': 'Paul',
 'football_position': 'Keeper',
 'basketball_position': 4,...}

Obviously not everyone plays every sport, so some documents will be missing some of these fields. For a person who plays no sport at all, the document would be

{'name': 'Louis'}

What I want to do is keep only the people who play at least one sport, inside my aggregation pipeline.

I know this is easy to check for a single field with {'$match': {'football_position': {'$exists': True}}}, but I want to check whether any of these fields exists.

I found a somewhat similar old question (Check for existence of multiple fields in MongoDB document), but it checks for the existence of all fields, which, while tedious, could be achieved by chaining multiple $match stages. Also, maybe MongoDB now has a better way to handle this than writing a custom JavaScript function.

Recommended answer

"maybe MongoDB now has a better way to handle this"

Yes, you can now use the aggregation operator $objectToArray (SERVER-23310) to turn a document's keys into values, which makes it possible to count a 'dynamic' number of fields. Combining this operator with $addFields can be quite useful.

Both operators are available in MongoDB v3.4.4+. Using your documents above as an example:

db.sports.aggregate([
  { $addFields: { numFields: { $size: { $objectToArray: "$$ROOT" } } } },
  { $match: { numFields: { $gt: 2 } } }
])

The aggregation pipeline above first adds a field called numFields. $objectToArray turns the document into an array with one {k, v} entry per top-level field, so the $size of that array is the number of fields in the document. The second stage then keeps only documents with more than two fields (two, because every document already has the _id and name fields), that is, documents with at least one sport field.
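To make the counting step concrete, here is a small pure-Python sketch that mimics what $objectToArray followed by $size computes for the two example documents. The helper names object_to_array and plays_a_sport and the _id values are made up for illustration; the code only emulates the server-side behaviour and assumes every document has exactly _id and name besides its sport fields.

```python
def object_to_array(doc):
    """Emulate $objectToArray: one {"k": key, "v": value} entry per top-level field."""
    return [{"k": key, "v": value} for key, value in doc.items()]


def plays_a_sport(doc, baseline_fields=2):
    """Emulate the $match stage: keep documents with more fields than the baseline.

    baseline_fields=2 assumes every document has exactly _id and name
    besides its sport fields (an assumption taken from the example).
    """
    num_fields = len(object_to_array(doc))
    return num_fields > baseline_fields


# Hypothetical documents; the _id values are placeholders.
paul = {"_id": 1, "name": "Paul", "football_position": "Keeper", "basketball_position": 4}
louis = {"_id": 2, "name": "Louis"}

players = [doc["name"] for doc in (paul, louis) if plays_a_sport(doc)]
print(players)
```

Note that the baseline of two is tied to the document shape: if documents carry other non-sport fields, the threshold has to change accordingly.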

In PyMongo, the same aggregation pipeline would look like this:

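pipeline = [
    # Count top-level fields by converting the document to an array of {k, v} pairs.
    {"$addFields": {"numFields": {"$size": {"$objectToArray": "$$ROOT"}}}},
    # _id and name account for two fields, so more than two means at least one sport.
    {"$match": {"numFields": {"$gt": 2}}},
]
cursor = collection.aggregate(pipeline)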
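If the set of sport fields happens to be known up front, a different approach (not the field-counting technique shown here) is to build an $or of $exists checks from a list. The sketch below only constructs the $match stage and does not talk to a server; SPORT_FIELDS and any_field_exists_stage are hypothetical names chosen for illustration.

```python
def any_field_exists_stage(field_names):
    """Build a $match stage that keeps documents having at least one of the fields."""
    if not field_names:
        # $or needs a non-empty array of conditions.
        raise ValueError("field_names must not be empty")
    return {"$match": {"$or": [{name: {"$exists": True}} for name in field_names]}}


# Hypothetical list of sport fields; adjust to the real schema.
SPORT_FIELDS = ["football_position", "basketball_position"]

stage = any_field_exists_stage(SPORT_FIELDS)
print(stage)
# With a PyMongo collection this could be run as: collection.aggregate([stage])
```

This variant does not depend on how many non-sport fields a document has, at the cost of having to keep the field list up to date.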

Having said that, if your use case allows it, I would suggest reconsidering your data model for easier access. For example, add a field that keeps track of the number of sports and update it whenever a new sport position is inserted.
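One way the counter idea could look in PyMongo terms is sketched below. The counter field name num_sports and the helper add_sport_update are assumptions, not part of the original answer, and the sketch assumes the sport field is not already present on the document.

```python
def add_sport_update(sport_field, position, counter_field="num_sports"):
    """Build an update document that sets a sport position and bumps the counter.

    Assumes the sport field is not already present on the document;
    otherwise the counter would be incremented again for the same sport.
    """
    return {
        "$set": {sport_field: position},
        "$inc": {counter_field: 1},
    }


update = add_sport_update("football_position", "Keeper")
print(update)
# With PyMongo this could be applied as:
#   collection.update_one({"name": "Paul"}, update)
# and the pipeline filter then becomes a plain match:
#   {"$match": {"num_sports": {"$gte": 1}}}
```

Because $inc creates the field when it is missing, documents that never had a sport added simply lack the counter and are excluded by the $gte match.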
