MongoDB distinct on an array field with a regex query?

Question

Basically, I'm trying to implement tag functionality on a model.

> db.event.distinct("tags")
[ "bar", "foo", "foobar" ]

A simple distinct query retrieves all distinct tags. But how would I go about getting only the distinct tags that match a certain query? For example, say I want all tags matching foo and expect to get ["foo","foobar"] as the result.

The following queries are my failed attempts at achieving this:

> db.event.distinct("tags",/foo/)
[ "bar", "foo", "foobar" ]

> db.event.distinct("tags",{tags: {$regex: 'foo'}})
[ "bar", "foo", "foobar" ]

Recommended answer

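The query argument to .distinct() only selects documents and does not filter the individual array elements, so use the aggregation framework rather than the .distinct() command: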

db.event.aggregate([
    // De-normalize the array content to separate documents
    { "$unwind": "$tags" },

    // Filter the de-normalized content to remove non-matches
    { "$match": { "tags": /foo/ } },

    // Group the "like" terms as the "key"
    { "$group": {
        "_id": "$tags"
    }}
])
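
To see what each stage contributes, the pipeline above can be mimicked in plain JavaScript on an in-memory array. This is only an illustrative sketch: the `events` sample documents and the `distinctMatchingTags` helper are made up for the example and are not part of MongoDB.

```javascript
// Hypothetical sample documents standing in for the "event" collection.
const events = [
  { _id: 1, tags: ["foo", "bar"] },
  { _id: 2, tags: ["foobar", "bar"] },
  { _id: 3, tags: ["foo"] }
];

// Hypothetical helper that mirrors $unwind -> $match -> $group in memory.
function distinctMatchingTags(docs, pattern) {
  // $unwind: one entry per array element
  const unwound = docs.flatMap(doc => doc.tags);
  // $match: keep only the elements that match the pattern
  const matched = unwound.filter(tag => pattern.test(tag));
  // $group on the tag value: collapse duplicates.
  // Sorted only to make the output stable; $group itself guarantees no order.
  return [...new Set(matched)].sort();
}

console.log(distinctMatchingTags(events, /foo/)); // [ 'foo', 'foobar' ]
```

Note that the real pipeline returns one document per tag, such as { "_id" : "foo" }, rather than a flat array of strings.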

You are probably better off using an anchor (^) at the beginning of the regex if you mean to match from the start of the string. It also helps to run this $match before you process $unwind as well:

db.event.aggregate([
    // Match the possible documents. Always the best approach
    { "$match": { "tags": /^foo/ } },

    // De-normalize the array content to separate documents
    { "$unwind": "$tags" },

    // Now "filter" the content to actual matches
    { "$match": { "tags": /^foo/ } },

    // Group the "like" terms as the "key"
    { "$group": {
        "_id": "$tags"
    }}
])

That makes sure you are not running $unwind on every document in the collection, only on those that possibly contain a matching tag, before you filter the unwound elements to make sure.
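
The effect of the anchor, and the reason the second $match is still needed after the first, can be illustrated in plain JavaScript. The sample documents and the `matchDocs` helper are assumptions made for this example only.

```javascript
// Hypothetical sample documents; "barfoo" contains "foo" but does not start with it.
const sampleDocs = [
  { _id: 1, tags: ["foo", "bar"] },
  { _id: 2, tags: ["barfoo"] },
  { _id: 3, tags: ["baz"] }
];

// Mirrors the first $match: keep documents where at least one tag matches.
function matchDocs(docs, pattern) {
  return docs.filter(doc => doc.tags.some(tag => pattern.test(tag)));
}

// Documents selected by an unanchored and by an anchored pattern.
const unanchoredIds = matchDocs(sampleDocs, /foo/).map(doc => doc._id);
const anchoredIds = matchDocs(sampleDocs, /^foo/).map(doc => doc._id);
console.log(unanchoredIds); // [ 1, 2 ]
console.log(anchoredIds);   // [ 1 ]

// The selected document still carries its non-matching tag ("bar"),
// which is why the elements are filtered again after unwinding.
const remainingTags = matchDocs(sampleDocs, /^foo/)
  .flatMap(doc => doc.tags)
  .filter(tag => /^foo/.test(tag));
console.log(remainingTags); // [ 'foo' ]
```

As far as the documented behaviour goes, a case-sensitive prefix expression such as /^foo/ can also use an index on "tags" efficiently if one exists, whereas an unanchored /foo/ has to examine every indexed value.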

The more complex way to somewhat mitigate the cost of large arrays with possible matches takes a bit more work and requires MongoDB 2.6 or greater:

db.event.aggregate([
    { "$match": { "tags": /^foo/ } },
    { "$project": {
        "tags": { "$setDifference": [
            { "$map": {
                "input": "$tags",
                "as": "el",
                "in": { "$cond": [
                    { "$eq": [ 
                        { "$substr": [ "$$el", 0, 3 ] },
                        "foo"
                    ]},
                    "$$el",
                    false
                ]}
            }},
            [false]
        ]}
    }},
    { "$unwind": "$tags" },
    { "$group": { "_id": "$tags" }}
])

So $map is a nice in-line processor of arrays, but it can only go so far. The $setDifference operator removes the false placeholders left for non-matching elements, but ultimately you still need $unwind so that the remaining $group stage can produce distinct values across all documents.

The advantage here is that each array is now reduced to only the "tags" elements that match before it is unwound. Just don't use this when you want a count of occurrences and the same value can appear more than once in a document, since the set operator discards duplicates. But again, there are other ways to handle that.
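
What the $map and $setDifference combination does to a single array can be sketched in plain JavaScript. The `docTags` array is a hypothetical example, and the code only imitates the operators in memory; it is not how MongoDB evaluates them.

```javascript
// Hypothetical tags array of a single document, with a repeated "foo".
const docTags = ["foo", "bar", "foobar", "foo"];

// $map with $cond: keep the element if its first three characters are "foo",
// otherwise put the placeholder false in its place.
const mapped = docTags.map(el => (el.substring(0, 3) === "foo" ? el : false));
console.log(mapped); // [ 'foo', false, 'foobar', 'foo' ]

// $setDifference against [false]: drop the placeholder. Set semantics also
// collapse the repeated "foo", which is why per-document counts are lost.
// (The real operator does not guarantee the order of the resulting elements.)
const reduced = [...new Set(mapped)].filter(el => el !== false);
console.log(reduced); // [ 'foo', 'foobar' ]
```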
