Bulk insert performance in MongoDB for large collections

Question

I'm using the BulkWriteOperation (Java driver) to store data in large chunks. At first it seems to work fine, but as the collection grows in size, the inserts can take quite a lot of time.
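
For context, here is a minimal sketch of the insert loop, assuming the legacy 2.x Java driver API; the database and collection names and the document shape are placeholders:

```java
import com.mongodb.BasicDBObject;
import com.mongodb.BulkWriteOperation;
import com.mongodb.BulkWriteResult;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.MongoClient;

import java.util.Date;

public class BulkInsertExample {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("localhost", 27017);
        DB db = client.getDB("mydb");                      // placeholder database name
        DBCollection events = db.getCollection("events");  // placeholder collection name

        // An unordered bulk operation lets the server apply the inserts in any
        // order, which is generally faster when every operation is an insert.
        BulkWriteOperation bulk = events.initializeUnorderedBulkOperation();
        for (int i = 0; i < 1000; i++) {
            bulk.insert(new BasicDBObject("time", new Date()).append("value", i));
        }
        BulkWriteResult result = bulk.execute();
        System.out.println("Inserted: " + result.getInsertedCount());

        client.close();
    }
}
```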

Currently, for a collection of 20M documents, a bulk insert of 1000 documents can take about 10 seconds.

Is there a way to make inserts independent of collection size? I don't have any updates or upserts; it's always new data I'm inserting.

Judging from the log, there doesn't seem to be any issue with locks. Each document has a time field which is indexed, but that field grows linearly, so I don't see any need for Mongo to spend time reorganizing the index.

I'd love to hear some ideas for improving the performance.

Thanks

Answer

You believe that the indexing does not require any document reorganisation, and the way you describe the index suggests that a right-handed index is fine. So indexing seems to be ruled out as an issue. You could of course, as suggested above, definitively rule this out by dropping the index and re-running your bulk writes.
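
For example, here is a minimal sketch of that experiment with the legacy 2.x Java driver; the index name "time_1", the database, and the collection names are assumptions based on the description above:

```java
import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.MongoClient;

public class IndexExperiment {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("localhost", 27017);
        DBCollection events = client.getDB("mydb").getCollection("events");

        // "time_1" is the default name MongoDB generates for an ascending
        // index on "time"; adjust it if the index was created with a custom name.
        events.dropIndex("time_1");

        // ... re-run the bulk inserts here and compare timings ...

        // Recreate the index afterwards so reads are unaffected.
        events.createIndex(new BasicDBObject("time", 1));

        client.close();
    }
}
```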

Beyond indexing, I would:

  • Consider whether your disk can keep up with the volume of data you are persisting. More details on this in the Mongo docs.
  • Use profiling to understand what's happening with your writes; see the sketch after this list.
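
As a starting point, here is a sketch of driving the database profiler from the Java driver; the profiling level, the slowms threshold, and the database name are illustrative choices, not prescriptions:

```java
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCursor;
import com.mongodb.MongoClient;

public class ProfilingExample {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("localhost", 27017);
        DB db = client.getDB("mydb");  // placeholder database name

        // Level 1 records operations slower than slowms; level 2 records everything.
        db.command(new BasicDBObject("profile", 1).append("slowms", 100));

        // ... run the bulk inserts here ...

        // Inspect the slowest captured operations.
        DBCursor cursor = db.getCollection("system.profile")
                .find()
                .sort(new BasicDBObject("millis", -1))
                .limit(5);
        while (cursor.hasNext()) {
            System.out.println(cursor.next());
        }

        client.close();
    }
}
```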
