MongoDB data files become smaller after migration


Question

On my first server I get:

root@prod ~ # du -hs /var/lib/mongodb/
909G    /var/lib/mongodb/

After migrating this database with mongodump/mongorestore, I get the following on my second server:

root@prod ~ # du -hs /var/lib/mongodb/
30G /var/lib/mongodb/

After waiting a few hours for mongo to finish indexing, I got:

root@prod ~ # du -hs /var/lib/mongodb/
54G /var/lib/mongodb/

I tested the database and there is no corrupted or missing data.

Why is there such a big difference in size before and after the migration?

Recommended answer

MongoDB does not recover disk space when the actual data size drops, whether because of data deletion or for other reasons. There's a decent explanation in the online docs:

Why are the files in my data directory larger than the data in my database?

The data files in your data directory, which is the /data/db directory in default configurations, might be larger than the data set inserted into the database. Consider the following possible causes:

Preallocated data files.

In the data directory, MongoDB preallocates data files to a particular size, in part to prevent file system fragmentation. MongoDB names the first data file .0, the next .1, etc. The first file mongod allocates is 64 megabytes, the next 128 megabytes, and so on, up to 2 gigabytes, at which point all subsequent files are 2 gigabytes. The data files include files with allocated space that hold no data. mongod may allocate a 1 gigabyte data file that may be 90% empty. For most larger databases, unused allocated space is small compared to the database.
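The doubling schedule quoted above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the FAQ text (64 MB first file, doubling up to a 2 GB cap); the function names are hypothetical, and the sketch ignores the extra preallocated file, the journal, and any other files in the data directory.

```python
MB = 1024 * 1024
FIRST_FILE = 64 * MB      # size of the first data file (.0), per the quoted FAQ
MAX_FILE = 2048 * MB      # cap: every later file is 2 GB


def data_file_sizes(count):
    """Return the sizes in bytes of the first `count` data files (.0, .1, ...)."""
    sizes = []
    size = FIRST_FILE
    for _ in range(count):
        sizes.append(size)
        size = min(size * 2, MAX_FILE)  # double until the 2 GB cap is reached
    return sizes


def files_needed(data_bytes):
    """Return (file_count, allocated_bytes) needed to hold `data_bytes`.

    Assumes files are filled completely and in order, which is a
    simplification of how the server actually places records.
    """
    count = 0
    allocated = 0
    size = FIRST_FILE
    while allocated < data_bytes:
        allocated += size
        count += 1
        size = min(size * 2, MAX_FILE)
    return count, allocated


if __name__ == "__main__":
    for index, size in enumerate(data_file_sizes(7)):
        print(f".{index}: {size // MB} MB")
    count, allocated = files_needed(1000 * MB)
    print(f"1000 MB of data -> {count} files, {allocated // MB} MB allocated")
```

Under these assumptions, a database holding 1000 MB already occupies almost twice that on disk, which is one reason why a directory can be noticeably larger than the data it contains even before any deletions.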

On Unix-like systems, mongod preallocates an additional data file and initializes the disk space to 0. Preallocating data files in the background prevents significant delays when a new database file is next allocated.

You can disable preallocation by setting preallocDataFiles to false. However, do not disable preallocDataFiles in production environments: only disable it for testing and with small data sets where you frequently drop databases.

On Linux systems you can use hdparm to get an idea of how costly allocation might be:

time hdparm --fallocate $((1024*1024)) testfile

The oplog.

If this mongod is a member of a replica set, the data directory includes the oplog.rs file, which is a preallocated capped collection in the local database. The default allocation is approximately 5% of disk space on 64-bit installations; see Oplog Sizing for more information. In most cases, you should not need to resize the oplog. However, if you do, see Change the Size of the Oplog.

The journal.

The data directory contains the journal files, which store write operations on disk prior to MongoDB applying them to databases. See Journaling Mechanics.

Empty records.

MongoDB maintains lists of empty records in data files when deleting documents and collections. MongoDB can reuse this space, but will never return this space to the operating system.

To de-fragment allocated storage, use compact, which de-fragments allocated space. By de-fragmenting storage, MongoDB can effectively use the allocated space. compact requires up to 2 gigabytes of extra disk space to run. Do not use compact if you are critically low on disk space.

Important

compact only removes fragmentation from MongoDB data files and does not return any disk space to the operating system.

To reclaim deleted space, use repairDatabase, which rebuilds the database, de-fragmenting the storage, and may release space to the operating system. repairDatabase requires up to 2 gigabytes of extra disk space to run. Do not use repairDatabase if you are critically low on disk space.

http://docs.mongodb.org/manual/faq/storage/

What they don't tell you is that there are two other ways to recover disk space: mongodump/mongorestore, as you did, or adding a new member with an empty disk to the replica set, so that it writes its database files from scratch.

If you are interested in monitoring this, the db.stats() command returns a wealth of information on data, index, storage and file sizes:

http://docs.mongodb.org/manual/reference/command/dbStats/
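As a rough illustration of how those numbers can be compared, the sketch below takes a dbStats result as a plain dictionary and estimates how much on-disk space holds neither documents nor indexes. The field names (dataSize, storageSize, indexSize, fileSize) are the ones dbStats reports; fileSize is assumed to be present only for the preallocating storage engine described above, and the helper name and the sample figures are hypothetical. With PyMongo, such a dictionary could be obtained from client["mydb"].command("dbstats"), where "mydb" is a placeholder database name.

```python
GIB = 1024 ** 3


def summarize_db_stats(stats):
    """Estimate unused on-disk space from a dbStats result dictionary.

    Uses fileSize as the on-disk footprint when it is reported;
    otherwise falls back to storageSize + indexSize.
    """
    used = stats["dataSize"] + stats["indexSize"]
    if "fileSize" in stats:
        on_disk = stats["fileSize"]
    else:
        on_disk = stats["storageSize"] + stats["indexSize"]
    slack = on_disk - used
    ratio = slack / on_disk if on_disk else 0.0
    return {
        "used_bytes": used,
        "on_disk_bytes": on_disk,
        "slack_bytes": slack,
        "slack_ratio": ratio,
    }


if __name__ == "__main__":
    # Hypothetical figures: only the 909G and 54G totals echo the question;
    # the split between data, storage and indexes is invented for illustration.
    sample = {
        "dataSize": 30 * GIB,
        "storageSize": 40 * GIB,
        "indexSize": 24 * GIB,
        "fileSize": 909 * GIB,
    }
    summary = summarize_db_stats(sample)
    print(f"slack: {summary['slack_bytes'] // GIB} GiB "
          f"({summary['slack_ratio']:.1%} of the files on disk)")
```

A high slack ratio on the old server would be consistent with the explanation above: the files kept their size after deletions, and the dump/restore rewrote only the live data.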
