Why and when is it necessary to rebuild indexes in MongoDB?


Question

I have been working with MongoDB for a while, and today a question came up while I was discussing it with a colleague.

When you create an index in MongoDB, the collection is processed and the index is built.

The index is then updated as documents are inserted and deleted, so I don't really see the need to run a rebuild index operation (which drops the index and then rebuilds it).

According to the MongoDB documentation:


Normally, MongoDB compacts indexes during routine updates. For most users, the reIndex command is unnecessary. However, it may be worth running if the collection size has changed significantly or if the indexes are consuming a disproportionate amount of disk space.

Has anyone actually needed to run a rebuild index operation, and was it worth it?

Recommended answer

As per the MongoDB documentation, there is generally no need to routinely rebuild indexes.

NOTE: Any advice on storage becomes more interesting with MongoDB 3.0+, which introduced a pluggable storage engine API. My comments below refer specifically to the default MMAP storage engine in MongoDB 3.0 and earlier. WiredTiger and other storage engines have different storage implementations for data and indexes.

There may be some benefit in rebuilding an index with the MMAP storage engine if:


  • An index is consuming a larger than expected amount of space compared to the data. Note: you need to monitor historical data and index sizes to have a baseline for comparison.

  • You want to migrate from an older index format to a newer one. If a reindex is advisable, this will be mentioned in the upgrade notes. For example, MongoDB 2.0 introduced significant index performance improvements, so the release notes include a suggested reindex to the index format introduced in v2.0 after upgrading. Similarly, MongoDB 2.6 introduced 2dsphere (version 2) indexes, which have a different default behaviour (sparse by default). Existing indexes are not rebuilt after index version upgrades; the choice of if and when to upgrade is left to the database administrator.

  • You have changed the _id format for a collection from a monotonically increasing key (e.g. ObjectID) to a random value, or vice versa. This is a bit esoteric, but there is an index optimisation that splits b-tree buckets 90/10 (instead of 50/50) if you are inserting _ids that are always increasing (ref: SERVER-983). If the nature of your _ids changes significantly, it may be possible to build a more efficient b-tree with a re-index.
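
To make the first point concrete, here is a minimal Python sketch of comparing a collection's index-to-data ratio against a recorded baseline. It assumes each snapshot is a plain dict carrying the `size` and `totalIndexSize` fields (in bytes) that the collStats command reports. The helper names, the sample numbers, and the 1.5x tolerance are hypothetical choices for illustration, not a MongoDB recommendation.

```python
def index_to_data_ratio(stats):
    """Return total index size divided by data size for one collection."""
    data_size = stats["size"]
    if data_size == 0:
        return 0.0  # empty collection: no meaningful ratio
    return stats["totalIndexSize"] / data_size


def looks_bloated(current, baseline, tolerance=1.5):
    """True if the current ratio exceeds the baseline ratio by `tolerance` times.

    `tolerance` is an arbitrary assumption; pick a value that suits your data.
    """
    return index_to_data_ratio(current) > tolerance * index_to_data_ratio(baseline)


# Hypothetical snapshots in bytes, shaped like a subset of collStats output.
baseline = {"size": 1_000_000, "totalIndexSize": 200_000}
current = {"size": 1_100_000, "totalIndexSize": 700_000}

print(looks_bloated(current, baseline))
```

A check like this only flags a candidate for investigation. Whether a rebuild actually reclaims space depends on the storage engine and the workload.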

For more information on general B-tree behaviour, see: Wikipedia: B-tree
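
The effect of the 90/10 versus 50/50 split can be illustrated with a toy model. The sketch below is not MongoDB's implementation: it models only a flat row of sorted leaf buckets with a fixed capacity (no interior nodes), and the function name, bucket capacity, and key counts are assumptions chosen for illustration.

```python
import bisect
import random


def average_fill(keys, capacity, left_fraction):
    """Insert keys into sorted leaf buckets and return the mean fill ratio.

    When a bucket overflows, it is split so that roughly `left_fraction`
    of its keys stay in the left bucket and the rest move to a new right one.
    """
    leaves = [[]]  # buckets kept in ascending key order
    for key in keys:
        # Pick the first bucket whose largest key is >= key, else the last one.
        idx = len(leaves) - 1
        for i, leaf in enumerate(leaves):
            if leaf and key <= leaf[-1]:
                idx = i
                break
        leaf = leaves[idx]
        bisect.insort(leaf, key)
        if len(leaf) > capacity:
            cut = int(len(leaf) * left_fraction)
            cut = max(1, min(cut, len(leaf) - 1))  # keep both halves non-empty
            leaves[idx:idx + 1] = [leaf[:cut], leaf[cut:]]
    total_keys = sum(len(leaf) for leaf in leaves)
    return total_keys / (len(leaves) * capacity)


sequential = list(range(1000))
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)

seq_50 = average_fill(sequential, capacity=10, left_fraction=0.5)
seq_90 = average_fill(sequential, capacity=10, left_fraction=0.9)
rand_50 = average_fill(shuffled, capacity=10, left_fraction=0.5)

print(f"sequential keys, 50/50 split: {seq_50:.2f}")
print(f"sequential keys, 90/10 split: {seq_90:.2f}")
print(f"random keys,     50/50 split: {rand_50:.2f}")
```

In this model, always-increasing keys only ever land in the rightmost bucket, so a 50/50 split leaves every earlier bucket about half full, while a 90/10 split leaves them close to 90% full. That is the intuition behind rebuilding after the nature of the _id values changes.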

If you're really curious to dig into the index internals a bit more, there are some experimental commands and tools you can try. I expect these are limited to MongoDB 2.4 and 2.6 only:

  • indexStats command
  • storage-viz tool
