Mongo and Java: Create indexes for aggregation framework


Question

Situation: I have a collection with a huge number of documents produced by a map-reduce (aggregation) job. Documents in the collection look like this:

/* 0 */
{
    "_id" : {
        "appId" : ObjectId("1"),
        "timestamp" : ISODate("2014-04-12T00:00:00.000Z"),
        "name" : "GameApp",
        "user" : "test@mail.com",
        "type" : "game"
    },
    "value" : {
        "count" : 2
    }
}

/* 1 */
{
    "_id" : {
        "appId" : ObjectId("2"),
        "timestamp" : ISODate("2014-04-29T00:00:00.000Z"),
        "name" : "ScannerApp",
        "user" : "newUser@company.com",
        "type" : "game"
    },
    "value" : {
        "count" : 5
    }
}

...

I then search this collection with the aggregation framework:

// The aggregation returns results on a daily or monthly basis, depending on the user's search criteria, with pagination, etc.
db.myCollection.aggregate([match, project, group, sort, skip, limit]);

Possible search criteria:

1. {appId, timestamp, name, user, type} 
2. {appId, timestamp}
3. {name, user}

I'm getting the correct result, exactly what I need. But from an optimisation point of view I have doubts about indexing.

Questions:

  1. Is it possible to create indexes for such a collection?
  2. How can I create indexes for such objects with a complex _id field?
  3. How can I do the equivalent of db.collection.find().explain() to verify which index is used?
  4. Is it a good idea to index such a collection, or is it just my performance paranoia?


Summary of answers:

  • MongoDB creates an index on the _id field automatically, but that index does not help when queries filter on individual sub-fields of a complex _id like the one in the example. For a field like _id: {name: "", timestamp: ""} you must create an index like this: *.ensureIndex({"_id.name": 1, "_id.timestamp": 1}). Only then will the collection be properly indexed on the _id sub-fields.
  • To check how your indexes are used by an aggregation, you cannot use db.myCollection.aggregate().explain(). The proper way is to pass explain as an option to the aggregation itself, as described in the recommended answer.

  • My testing on a local machine shows that such indexing seems to be a good idea, but it requires more testing with big collections.
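
The index advice above can be sketched as plain JavaScript that builds the key specifications for the search criteria from the question. This is a minimal sketch, not the author's code: the helper name idIndexSpec and the chosen field order are assumptions for illustration, and whether this order suits your queries needs to be checked with explain.

```javascript
// Build a compound index key spec over sub-fields of a document-valued _id.
// idIndexSpec is a hypothetical helper, not part of MongoDB.
function idIndexSpec(subFields) {
  const spec = {};
  for (const field of subFields) {
    spec["_id." + field] = 1; // 1 = ascending
  }
  return spec;
}

// Criteria 2 ({appId, timestamp}) is a prefix of criteria 1 in this field order,
// so a single compound index can serve both; criteria 3 gets its own index.
const fullSpec = idIndexSpec(["appId", "timestamp", "name", "user", "type"]);
const nameUserSpec = idIndexSpec(["name", "user"]);

// In the mongo shell these would be passed to createIndex
// (ensureIndex in older shells):
//   db.myCollection.createIndex(fullSpec);
//   db.myCollection.createIndex(nameUserSpec);
console.log(JSON.stringify(fullSpec));
console.log(JSON.stringify(nameUserSpec));
```

Key order matters in a compound index, which is why the helper takes an ordered array rather than an unordered set of names.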

Recommended answer

First, indexes for search criteria 1 and 3 are probably worth investigating. As for explain, you can pass explain as an option to your pipeline. The option is described in the MongoDB documentation for aggregate.
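
Passing explain as an option might look like the sketch below. It only builds the command document, so it runs without a server. The stage contents are hypothetical, taken from the sample documents in the question, and explainAggregateCommand is an illustrative helper, not a MongoDB API.

```javascript
// Hypothetical stages for search criteria 3 ({name, user}) with pagination.
const match = { $match: { "_id.name": "GameApp", "_id.user": "test@mail.com" } };
const sort = { $sort: { "_id.timestamp": 1 } };
const pipeline = [match, sort, { $skip: 0 }, { $limit: 20 }];

// Build the document for the aggregate database command with explain enabled.
function explainAggregateCommand(collectionName, stages) {
  return { aggregate: collectionName, pipeline: stages, explain: true };
}

const command = explainAggregateCommand("myCollection", pipeline);

// In the mongo shell either form asks for the plan instead of the results:
//   db.runCommand(command);
//   db.myCollection.aggregate(pipeline, { explain: true });
console.log(JSON.stringify(command, null, 2));
```

The explain output reports which index, if any, the leading $match stage used, which answers question 3.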
