Does a MongoDB/Mongoose index make queries faster or slow them down?

Question

I have an article model like this:

var mongoose = require('mongoose')
var Schema = mongoose.Schema

var ArticleSchema = new Schema({

    type: String
    ,title: String
    ,content: String
    ,hashtags: [String]

    ,comments: [{
        type: Schema.ObjectId
        ,ref: 'Comment'
    }]

    ,replies: [{
        type: Schema.ObjectId
        ,ref: 'Reply'
    }]

    , status: String
    ,statusMeta: {
        createdBy: {
            type: Schema.ObjectId
            ,ref: 'User'
        }
        ,createdDate: Date
        , updatedBy: {
            type: Schema.ObjectId
            ,ref: 'User'
        }
        ,updatedDate: Date

        ,deletedBy: {
            type: Schema.ObjectId,
            ref: 'User'
        }
        ,deletedDate: Date

        ,undeletedBy: {
            type: Schema.ObjectId,
            ref: 'User'
        }
        ,undeletedDate: Date

        ,bannedBy: {
            type: Schema.ObjectId,
            ref: 'User'
        }
        ,bannedDate: Date
        ,unbannedBy: {
            type: Schema.ObjectId,
            ref: 'User'
        }

        ,unbannedDate: Date
    }
}, {minimize: false})

When a user creates or modifies an article, I extract its hashtags:

ArticleSchema.pre('save', true, function(next, done) {
    var self = this
    if (self.isModified('content')) {
        self.hashtags = helper.listHashtagsInText(self.content)
    }
    done()
    return next()
})

For example, if a user writes "Hi, #greeting, i love #friday", I store ['greeting', 'friday'] in the hashtags list.
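
The helper itself is not shown in the question. As a minimal sketch of what helper.listHashtagsInText might look like, assuming a hashtag is a `#` followed by word characters and that tags are lowercased and de-duplicated (both are assumptions of this sketch, not stated above):

```javascript
// Hypothetical stand-in for helper.listHashtagsInText; the real helper is not shown.
function listHashtagsInText(text) {
    var pattern = /#(\w+)/g;   // assumption: a tag is '#' followed by word characters
    var seen = new Set();      // tracks tags already collected
    var tags = [];
    var match;
    while ((match = pattern.exec(text)) !== null) {
        var tag = match[1].toLowerCase();   // assumption: tags are case-insensitive
        if (!seen.has(tag)) {
            seen.add(tag);
            tags.push(tag);
        }
    }
    return tags;
}
```

Called on the sentence above, this would yield the two tags in the order they appear in the text.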

I am thinking about creating an index on hashtags to make queries on hashtags faster. But in the Mongoose manual, I found this:

When your application starts up, Mongoose automatically calls ensureIndex for each defined index in your schema. Mongoose will call ensureIndex for each index sequentially, and emit an 'index' event on the model when all the ensureIndex calls succeeded or when there was an error. While nice for development, it is recommended this behavior be disabled in production since index creation can cause a significant performance impact. Disable the behavior by setting the autoIndex option of your schema to false.

http://mongoosejs.com/docs/guide.html

So does indexing make MongoDB/Mongoose faster or slower?

Also, even if I create the index like this:

  hashtags: { type: [String], index: true }

How can I make use of the index in my query? Or will it just magically become faster for normal queries like:

   Article.find({hashtags: 'friday'})

Recommended answer

You are reading it wrong

You are misreading the intent of the quoted block, that is, what .ensureIndex() (now deprecated, but still called by Mongoose code) actually does in this context.

In Mongoose, you define an index at either the schema or the model level, as appropriate to your design. What Mongoose "automatically" does for you is, on connection, inspect each registered model and call the appropriate .ensureIndex() methods for the index definitions provided.

What does this actually do?

Well, in most cases, meaning any time after your application has already been started once and the .ensureIndexes() method has run, the answer is Absolutely Nothing. That is a bit of an overstatement, but it more or less rings true.

Because the index definition has already been created on the server collection, a subsequent call does not do anything; i.e., it does not drop the index and "re-create" it. So the real cost is basically nothing once the index itself has been created.

So, since Mongoose is just a layer on top of the standard API, the documentation for the createIndex() method contains all the details of what is happening.

There are some details to consider here, such as that an index build can happen in the "background". While this is less intrusive to your application, it comes at its own cost: notably, the index produced by a "background" build will be larger than one built in the foreground, which blocks other operations while it runs. (From MongoDB 4.2 onward the server uses a single build process and ignores the background option.)

Also, all indexes come at a cost, notably in disk usage and in the extra write needed for the index entries, which are stored outside of the collection data itself.

The advantage of an index is that it is much faster to "search" for values contained within an index than to seek through the whole collection and match the possible conditions.

These are the basic "trade-offs" associated with indexes.
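
To make the trade-off concrete, here is a toy in-memory sketch. It is not how MongoDB works internally (real indexes are B-trees maintained by the server); a plain Map stands in for the index, and all names in it are made up for illustration. It shows the extra write on every insert and the direct lookup that the extra write buys:

```javascript
// Toy in-memory model of a collection plus an index on `hashtags`.
// Real MongoDB indexes are B-trees kept by the server; a Map stands in here.
var collection = [];
var hashtagIndex = new Map(); // tag -> array of matching _id values

// Cost side: every insert writes the document AND one index entry per distinct tag.
function insertArticle(doc) {
    collection.push(doc);
    new Set(doc.hashtags).forEach(function (tag) {
        if (!hashtagIndex.has(tag)) {
            hashtagIndex.set(tag, []);
        }
        hashtagIndex.get(tag).push(doc._id);
    });
}

// Without an index: inspect every document in the collection.
function findByScan(tag) {
    return collection
        .filter(function (doc) { return doc.hashtags.indexOf(tag) !== -1; })
        .map(function (doc) { return doc._id; });
}

// With an index: go straight to the entry for the tag.
function findByIndex(tag) {
    return hashtagIndex.get(tag) || [];
}

insertArticle({ _id: 1, hashtags: ['greeting', 'friday'] });
insertArticle({ _id: 2, hashtags: ['monday'] });
insertArticle({ _id: 3, hashtags: ['friday'] });
```

Both findByScan('friday') and findByIndex('friday') return the same ids, but only the first one touches every document. In MongoDB the query planner makes this choice for you, so a query like Article.find({hashtags: 'friday'}) stays exactly as written once the index exists; MongoDB's explain output is one way to confirm the index is being used.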

Back to the quoted block from the documentation, there is a real intent behind this advice.

It is typical in deployment patterns and particularly with data migrations to do things in this order:

  1. Populate the data into the relevant collections/tables
  2. Enable indexes on the collection/table data relevant to your needs

This is because there is a cost involved in index creation. As mentioned earlier, it is desirable to get the optimal size from the index build, and to avoid having every document insertion also carry the overhead of writing an index entry when you are doing this "load" in bulk.

So that is what indexes are for, those are the costs and benefits, and that explains the message in the Mongoose documentation.
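
If you do want to follow the manual's advice, a minimal sketch of the schema options could look like the following. The autoIndex and minimize options come from the text above; switching on NODE_ENV is only an assumption of this sketch, not something the manual prescribes:

```javascript
// Options object intended as the second argument to `new Schema(definition, options)`.
// `minimize: false` is taken from the model in the question;
// `autoIndex` is the option named in the quoted manual text.
// Deciding by NODE_ENV is an assumption made for this sketch.
var isProduction = process.env.NODE_ENV === 'production';

var articleSchemaOptions = {
    minimize: false,
    autoIndex: !isProduction   // build indexes on startup only outside production
};
```

You would pass articleSchemaOptions in place of the {minimize: false} literal in the schema definition. With autoIndex off, the index has to be created some other way, for example once from the mongo shell with createIndex() after the data has been loaded.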

In general though, I suggest reading up on database indexes to learn what they are and what they do. Think of walking into a library to find a book. There is a card index there at the entrance. Do you walk around the library to find the book you want? Or do you look it up in the card index to find where it is? That index took someone time to create and to keep updated, but it saves "you" the time of walking around the whole library just so you can find your book.
