MongoDB MapReduce performance using indexes
Question
I have a sample document in MongoDB (I am still new to MongoDB):
{
"ID": 0,
"Facet1":"Value1",
"Facet2":[
{
"Facet2Obj1":{
"Obj1Facet1":"Value11",
"Obj2Facet1":"Value21",
"Obj3Facet1":"Value31"
}
},
{
"Facet2Obj2":{
"Obj1Facet2":"Value12",
"Obj2Facet2":"Value22",
"Obj3Facet2":"Value32"
}
},
{
"Facet2Obj3":{
"Obj1Facet3":"Value13",
"Obj2Facet3":"Value23",
"Obj3Facet3":"Value33"
}
}
],
"Facet3":"Value3",
"Facet4":{
"Facet4Obj1":{
"Obj1Facet1":"Value4111"
}
}
}
The map-reduce is somewhat complex, and it gives the following output (for 30,000 documents):
{
"_id" : "Facet1",
"value" : [
{
"value" : "Value1",
"count" : 30000,
"ID" : [
0,
1,
.
.
.
]
}
]
}
{
"_id" : "ID",
"value" : [
{
"value" : 0,
"count" : 1,
"ID" : [
0
]
},
{
"value" : 1,
"count" : 1,
"ID" : [
1
]
},
.
.
.
]
}
{
"_id" : "Facet2",
"value" : [
{
"value" : "Facet2Obj1",
"count" : 30000,
"ID" : [
0,
1,
.
.
.
]
},
{
"value" : "Facet2Obj2",
"count" : 30000,
"ID" : [
0,
1,
.
.
.
]
},
{
"value" : "Facet2Obj3",
"count" : 30000,
"ID" : [
0,
1,
.
.
.
]
}
]
}
{
"_id" : "Facet3",
"value" : [
{
"value" : "Value3",
"count" : 30000,
"ID" : [
0,
1,
2,
.
.
.
]
}
]
}
{
"_id" : "Facet4",
"value" : [
{
"value" : "Facet4Obj1",
"count" : 30000,
"ID" : [
0,
1,
2,
.
.
.
]
}
]
}
I inserted 30,000 documents in this format (with different IDs) into MongoDB and then ran a map-reduce, but it was slow: with 30,000 documents it took about 30 minutes. After I added an index on the facets it became somewhat faster, taking about 350 seconds, but with 50,000 documents it again took about 30 minutes. When I check the indexes using db.collection.getIndexes(), MongoDB returns this output:
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "database.collection",
"name" : "_id_"
},
{
"v" : 1,
"key" : {
"ID" : 1,
"Facet1" : 1,
"Facet2" : 1,
"Facet3" : 1,
"Facet4" : 1
},
"ns" : "database.collection",
"name" : "ID_1_Facet1_1_Facet2_1_Facet3_1_Facet4_1"
}
Did I do something wrong with the indexes, given that the map-reduce is still not fast enough? As I understand it, indexes must be placed strategically, or performance can get worse instead of better.
Answers are greatly appreciated. Thanks in advance.
Recommended answer
MapReduce passes every document in the collection to the map function, unless you pass it the {query: } option, which it uses to pre-filter the documents sent to MapReduce. You can also pass a {sort: } option to mapReduce, and it will send documents to the map function sorted on that field or fields.
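As a rough illustration of the two options described above, the sketch below passes query and sort to mapReduce in the mongo shell. The map and reduce functions are hypothetical stand-ins (the question does not show the real ones), and the ID range is an arbitrary assumption. The filter and sort are on ID because that is the leading key of the compound index shown in the question.

```javascript
// Hypothetical map function: emits one count per Facet1 value.
// The real map function from the question is not shown.
var mapFn = function () {
  emit(this.Facet1, 1);
};

// Hypothetical reduce function: sums the counts emitted for one key.
var reduceFn = function (key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
};

var options = {
  // Pre-filters the input documents; this stage can use an index.
  // The range 0..9999 is an assumed example, not taken from the question.
  query: { ID: { $gte: 0, $lt: 10000 } },
  // Orders the input documents; this stage can also use an index.
  sort: { ID: 1 },
  // Returns the results inline instead of writing them to a collection.
  out: { inline: 1 }
};

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.collection.mapReduce(mapFn, reduceFn, options));
}
```

Without the query option, every document in the collection is handed to mapFn, whatever indexes exist.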
Those are the only two places where indexes are used. After that, everything happens in the JavaScript thread that is spawned for the job.
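One way to check whether the input stage can use an index is to run the same filter and sort as an ordinary find() and look at its explain() output. This is a minimal sketch, assuming a hypothetical ID range filter that is not part of the original question; the exact shape of the explain output differs between MongoDB versions.

```javascript
// Assumed filter and sort, mirroring what would be passed to mapReduce
// as the query and sort options. The ID range is an arbitrary example.
var inputFilter = { ID: { $gte: 0, $lt: 10000 } };
var inputSort = { ID: 1 };

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  // The explain output reports whether an index or a full collection
  // scan was chosen for this filter and sort.
  printjson(db.collection.find(inputFilter).sort(inputSort).explain());
}
```

The map and reduce steps themselves do not appear in this output, since they run in JavaScript after the documents have been selected.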