Mongo and Java: Create indexes for aggregation framework
Question
Situation: I have a collection holding a huge number of documents produced by a map-reduce (aggregation) step. The documents in the collection look like this:
/* 0 */
{
"_id" : {
"appId" : ObjectId("1"),
"timestamp" : ISODate("2014-04-12T00:00:00.000Z"),
"name" : "GameApp",
"user" : "test@mail.com",
"type" : "game"
},
"value" : {
"count" : 2
}
}
/* 1 */
{
"_id" : {
"appId" : ObjectId("2"),
"timestamp" : ISODate("2014-04-29T00:00:00.000Z"),
"name" : "ScannerApp",
"user" : "newUser@company.com",
"type" : "game"
},
"value" : {
"count" : 5
}
}
...
I then search this collection with the aggregation framework:
db.myCollection.aggregate([match, project, group, sort, skip, limit]); // the aggregation can return results on a daily or monthly basis, depending on the user's search criteria, with pagination etc.
Possible search criteria:
1. {appId, timestamp, name, user, type}
2. {appId, timestamp}
3. {name, user}
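Because every searchable field sits inside the embedded _id document, each of these criteria has to be expressed with dot notation in the match stage. The following is a minimal sketch of how that could look. The helper name buildMatchStage and the sample values are hypothetical and only illustrate the idea; they are not taken from the original question.

```javascript
// Hypothetical helper: turns user-supplied search criteria into a $match stage.
// Field names come from the sample documents; all of them live under _id.
function buildMatchStage(criteria) {
  const conditions = {};
  for (const [field, value] of Object.entries(criteria)) {
    // Skip criteria the user did not fill in.
    if (value !== undefined && value !== null) {
      // Dot notation reaches into the embedded _id document.
      conditions["_id." + field] = value;
    }
  }
  return { $match: conditions };
}

// Search criteria 3 from the list above: {name, user}
const matchByNameAndUser = buildMatchStage({
  name: "GameApp",
  user: "test@mail.com",
});

console.log(JSON.stringify(matchByNameAndUser));
```

Keeping the match stage first in the pipeline matters here, since that is the position where MongoDB can use an index to narrow down the documents.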
I'm getting the correct result, exactly what I need. But from an optimisation point of view, I have doubts about indexing.
Questions:
- Is it possible to create indexes for such a collection?
- How can I create indexes for objects with a complex _id field like this?
- How can I do the equivalent of db.collection.find().explain() to verify which index is used?
- And is it a good idea to index such a collection, or is it just my performance paranoia?
Summary of answers:
- MongoDB creates an index on the _id field automatically, but that is useless for a complex _id field like the one in the example. For a field like _id: {name: "", timestamp: ""} you must create an index like *.ensureIndex({"_id.name": 1, "_id.timestamp": 1}); only then will the collection be properly indexed by the fields inside _id.
- To track how your indexes work with Mongo aggregation, you cannot use db.myCollection.aggregate().explain(); the proper way is to request explain output as an option of the aggregation itself.
- My testing on a local computer shows that such indexing seems to be a good idea, but it requires more testing with big collections.
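As a rough sketch of the indexing described above, the compound index keys for the search criteria could be built as below. The helper name indexKeysFor is hypothetical, and the field order simply follows the order in which the question lists the criteria, which is an assumption rather than a tuned choice. The sketch uses createIndex, the current name for what the question calls ensureIndex. The shell calls are guarded so they only run where the global db object exists.

```javascript
// Hypothetical helper: builds ascending compound index keys on fields inside _id.
function indexKeysFor(fields) {
  const keys = {};
  for (const field of fields) {
    keys["_id." + field] = 1; // 1 = ascending
  }
  return keys;
}

const indexSpecs = [
  // Criteria 1; its leading fields (appId, timestamp) also serve criteria 2,
  // because a compound index supports queries on its prefixes.
  indexKeysFor(["appId", "timestamp", "name", "user", "type"]),
  // Criteria 3
  indexKeysFor(["name", "user"]),
];

console.log(JSON.stringify(indexSpecs, null, 2));

// Only runs inside a Mongo shell session, where the global `db` exists.
if (typeof db !== "undefined") {
  indexSpecs.forEach((spec) => db.myCollection.createIndex(spec));
  printjson(db.myCollection.getIndexes());
}
```

Each extra index costs write time and storage, so it is worth measuring against a realistically sized collection before keeping both.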
Recommended answer
First, indexes for search criteria 1 and 3 are probably worth investigating. As for explain, you can pass explain as an option to your pipeline. You can find docs here and an example here.
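A minimal sketch of passing explain as an option might look like the following. The pipeline contents and sample values are assumptions chosen to match search criteria 3, not the asker's real pipeline. Both the helper form and the command form are shown, and the shell calls are guarded so they only run where the global db object exists.

```javascript
// Illustrative pipeline: the $match stage comes first so an index can be used.
const pipeline = [
  { $match: { "_id.name": "GameApp", "_id.user": "test@mail.com" } },
  { $sort: { "_id.timestamp": 1 } },
  { $limit: 20 },
];

// Command form: explain is a field of the aggregate command itself.
const explainCommand = {
  aggregate: "myCollection",
  pipeline: pipeline,
  explain: true,
};

console.log(JSON.stringify(explainCommand, null, 2));

// Only runs inside a Mongo shell session, where the global `db` exists.
if (typeof db !== "undefined") {
  // Helper form: explain passed as an option next to the pipeline.
  printjson(db.myCollection.aggregate(pipeline, { explain: true }));
  // Equivalent command form.
  printjson(db.runCommand(explainCommand));
}
```

The shape of the explain output depends on the server version, but in either form it should show whether the initial match stage was answered from an index or from a full collection scan.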