MongoDB MapReduce performance using indexes

Question

I have a sample document in MongoDB (I am still new to MongoDB):

{
    "ID": 0,
    "Facet1":"Value1",
    "Facet2":[
        {
            "Facet2Obj1":{
                "Obj1Facet1":"Value11",
                "Obj2Facet1":"Value21",
                "Obj3Facet1":"Value31"
            }   
        },
        {
            "Facet2Obj2":{
                "Obj1Facet2":"Value12",
                "Obj2Facet2":"Value22",
                "Obj3Facet2":"Value32"
            }
        },
        {
            "Facet2Obj3":{
                "Obj1Facet3":"Value13",
                "Obj2Facet3":"Value23",
                "Obj3Facet3":"Value33"
            }
        }
    ],
    "Facet3":"Value3",
    "Facet4":{
        "Facet4Obj1":{
            "Obj1Facet1":"Value4111"
        }
    }
}

The map-reduce is a little complex, and it gives the following output (for 30,000 documents):

{
    "_id" : "Facet1",
    "value" : [
        {
            "value" : "Value1",
            "count" : 30000,
            "ID" : [
                0,
                1,
                .
                .
                .
            ]
        }
    ]
}
{
    "_id" : "ID",
    "value" : [
        {
            "value" : 0,
            "count" : 1,
            "ID" : [
                0
            ]
        },
        {
            "value" : 1,
            "count" : 1,
            "ID" : [
                1
            ]
        },
        .
        .
        .
    ]
}
{
    "_id" : "Facet2",
    "value" : [
        {
            "value" : "Facet2Obj1",
            "count" : 30000,
            "ID" : [
                0,
                1,
                .
                .
                .
            ]
        },
        {
            "value" : "Facet2Obj2",
            "count" : 30000,
            "ID" : [
                0,
                1,
                .
                .
                .
            ]
        },
        {
            "value" : "Facet2Obj3",
            "count" : 30000,
            "ID" : [
                0,
                1,
                .
                .
                .
            ]
        }
    ]
}
{
    "_id" : "Facet3",
    "value" : [
        {
            "value" : "Value3",
            "count" : 30000,
            "ID" : [
                0,
                1,
                2,
                .
                .
                .
            ]
        }
    ]
} 
{
    "_id" : "Facet4",
    "value" : [
        {
            "value" : "Facet4Obj1",
            "count" : 30000,
            "ID" : [
                0,
                1,
                2,
                .
                .
                .
            ]
        }
    ]
}

I inserted 30,000 documents in this format (with different IDs) into MongoDB and then ran a map-reduce, but it was slow: with 30,000 documents it took about 30 minutes. After I added an index on the facets it became somewhat faster, taking about 350 seconds, but with 50,000 documents it again took about 30 minutes. When I check the indexes with db.collection.getIndexes(), MongoDB returns this output:

{
    "v" : 1,
    "key" : {
        "_id" : 1
    },
    "ns" : "database.collection",
    "name" : "_id_"
},
{
    "v" : 1,
    "key" : {
        "ID" : 1,
        "Facet1" : 1,
        "Facet2" : 1,
        "Facet3" : 1,
        "Facet4" : 1
    },
    "ns" : "database.collection",
    "name" : "ID_1_Facet1_1_Facet2_1_Facet3_1_Facet4_1"
}

Did I do something wrong with the indexes? The map-reduce is still not fast enough, and I understand that indexes must be placed strategically or they can hurt performance instead of helping.

Answers are greatly appreciated. Thanks in advance.

Recommended answer

MapReduce passes every document in the collection to the map function unless you pass it the {query: } option, which it uses to pre-filter the documents sent to MapReduce. You can also pass a {sort: } option to mapReduce, and it will send documents to the map function sorted on the given field(s).
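
As a rough illustration of those two options, here is a minimal sketch of a mapReduce call. The map and reduce functions (mapFacet1, reduceCount) are hypothetical stand-ins, not the more complex job from the question, and the sketch assumes an index on Facet1 exists (for example one created with db.collection.createIndex({ Facet1: 1 })) so that the query and sort have something to use.

```javascript
// Hypothetical map function: counts documents per Facet1 value.
// Inside mapReduce, MongoDB binds `this` to the current document
// and provides emit().
function mapFacet1() {
  emit(this.Facet1, 1);
}

// Reduce sums the partial counts. It may be called more than once per key,
// so it returns the same shape (a number) that map emits.
function reduceCount(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// query and sort are the parts of the job that can use an index.
var options = {
  query: { Facet1: "Value1" }, // pre-filters the documents sent to map
  sort: { Facet1: 1 },         // orders the input to map by Facet1
  out: { inline: 1 }           // return results instead of writing a collection
};

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.collection.mapReduce(mapFacet1, reduceCount, options));
}
```

If the job really has to look at every document, as the one in the question appears to, there is no selective query to pass and an index has little to offer.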

Those are the only two places where indexes are used. After that, everything happens in the JavaScript thread that is spawned for the job.
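
To make that split concrete, here is a small conceptual model in plain JavaScript. It is only a sketch, not MongoDB's implementation: simulateMapReduce is a hypothetical helper, the query is modelled as a predicate, the sort as a comparator, and emit is passed to map as an argument instead of being a global.

```javascript
// Conceptual model only, not how MongoDB is implemented.
function simulateMapReduce(docs, mapFn, reduceFn, options) {
  // Stage 1: filter and sort. In MongoDB this is the index-eligible part.
  var input = docs.filter(options.query).sort(options.sort);

  // Stage 2: map runs as plain JavaScript, once per remaining document.
  var groups = {};
  var mapCalls = 0;
  input.forEach(function (doc) {
    mapCalls++;
    mapFn.call(doc, function emit(key, value) {
      (groups[key] = groups[key] || []).push(value);
    });
  });

  // Stage 3: reduce each group of emitted values, also in JavaScript.
  var results = {};
  Object.keys(groups).forEach(function (key) {
    results[key] = reduceFn(key, groups[key]);
  });
  return { mapCalls: mapCalls, results: results };
}

// Six sample documents, half of which match the query.
var docs = [];
for (var i = 0; i < 6; i++) {
  docs.push({ ID: i, Facet1: i % 2 === 0 ? "Value1" : "Other" });
}

var out = simulateMapReduce(
  docs,
  function (emit) { emit(this.Facet1, 1); },
  function (key, values) {
    return values.reduce(function (a, b) { return a + b; }, 0);
  },
  {
    query: function (d) { return d.Facet1 === "Value1"; },
    sort: function (a, b) { return a.ID - b.ID; }
  }
);
console.log(out.mapCalls, out.results);
```

In this model map runs three times for the six documents, because the filter removed the other three before any JavaScript executed. Without a filter, map runs once per document regardless of which indexes exist.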
