Indexing arrays of objects in MongoDB

Question

I have a huge email dump that I am trying to store and query in MongoDB. There are 1.6M emails, each of which is stored as the output of a Node module that parses raw emails into nice JavaScript objects, like so:

{
    "text" : "This is the text of my email",
    "subject" : "Great opportunity",
    "from" : [ 
        {
            "address" : "chris.wilson@example.com",
            "name" : "Chris Wilson"
        }
    ],
    "to" : [ 
        {
            "address" : "person.a@example.com",
            "name" : "Person A"
        }, 
        {
            "address" : "person.b@example.com",
            "name" : "Person B"
        }, 
        {
            "address" : "person.c@example.com",
            "name" : "Person C"
        }
    ],
    "date" : ISODate("2015-01-05T21:38:55.000Z")
}

I need to be able to efficiently look up things like "All emails sent to person.a@gmail.com" or "Every email sent by 'Chris Wilson'" (regardless of which email address is attached to that name).

Mongo is perfectly willing to index the "to" and "from" fields for me, but I'm not certain that the index actually helps when I run a query like this:

db.emails.find({ "to.name": "Person A" })

Is this a covered query, to look for a specific value of a specific property in a field that is an array of key-value objects? These queries are running VERY slowly for me, but then again it is a large corpus.

Update

Here's the output of appending ".explain()" to the above query:

{
    "cursor" : "BasicCursor",
    "isMultiKey" : false,
    "n" : 24,
    "nscannedObjects" : 1646837,
    "nscanned" : 1646837,
    "nscannedObjectsAllPlans" : 1646837,
    "nscannedAllPlans" : 1646837,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 17088,
    "nChunkSkips" : 0,
    "millis" : 84685,
    "server" : "DCA-TM-GUEST-iMac.local:27017",
    "filterSet" : false
}

Recommended answer

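That's perfectly fine, yes. You'd need an index on to.name to make that query efficient, though. The fact that it currently uses a BasicCursor indicates that there's no index, or that the index isn't being used, which would be rather odd. The explain output is consistent with a full collection scan: nscanned is 1,646,837, i.e. every document, to return n = 24 matches. For reference, indexes over array fields like this are called 'multikey' indexes.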
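
As a minimal sketch of the suggestion above, the index could be created and the plan re-checked roughly like this. It assumes the collection is named emails (as in the question) and that `db` is the database handle provided by the mongo shell or mongosh; `indexAndExplain` is a hypothetical helper name used only for illustration.

```javascript
// Sketch for the mongo shell / mongosh. `db` is the shell's database handle.
// `indexAndExplain` is a hypothetical helper, not part of MongoDB.
function indexAndExplain(db) {
  // Indexing a field inside an array of subdocuments makes the index
  // multikey: one index entry is stored per array element.
  // (Shells older than 3.0 used ensureIndex for the same purpose.)
  db.emails.createIndex({ "to.name": 1 });

  // Re-run the query from the question and return its plan for inspection.
  return db.emails.find({ "to.name": "Person A" }).explain();
}
```

In the shell this would be called as `indexAndExplain(db)`. With the legacy explain format shown in the question, a plan that uses the index should report a BtreeCursor on to.name_1 and "isMultiKey" : true instead of BasicCursor; newer server versions report an index scan stage in a differently shaped document.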

Is this a covered query [...]

I guess you mean 'covered' in the sense of "is this functionality covered by MongoDB"? 'Covered query' is a term used for queries that can be answered using the index alone. A query can be covered by an index only if all the fields it filters on and returns are part of that index (e.g. give me the ids, and only the ids, of emails that were sent to John Doe), but that wouldn't make much sense in this context, I guess. Also, sadly, covering isn't supported yet when the query reaches into embedded documents, as "to.name" does.
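
To make the term concrete, here is a hedged sketch of a query that can be covered. It deliberately uses the scalar "subject" field from the sample document rather than "to.name", and assumes "subject" is a plain string in every document. `coveredSubjectLookup` is a hypothetical helper name, and `db` is again assumed to be the shell's database handle.

```javascript
// Sketch for the mongo shell / mongosh. `db` is the shell's database handle.
// `coveredSubjectLookup` is a hypothetical helper, not part of MongoDB.
function coveredSubjectLookup(db) {
  // Index on a scalar field, so the index is not multikey.
  db.emails.createIndex({ subject: 1 });

  // Both the filter and the projection use only the indexed field.
  // _id is excluded explicitly because it is not part of this index.
  return db.emails
    .find({ subject: "Great opportunity" }, { _id: 0, subject: 1 })
    .explain();
}
```

In the legacy explain format from the question, a covered query shows "indexOnly" : true. Adding any non-indexed field to the projection, or leaving _id in, means the documents themselves have to be fetched.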
