MongoDB:模式迁移,更新或插入 [英] MongoDB: schema migration, update or insert

查看:103
本文介绍了MongoDB:模式迁移,更新或插入的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

虽然MongoDB不需要任何固定的架构,但有时我们还是希望从一种结构迁移到另一种结构.

While MongoDB doesn't require any fixed schema, there are times we would like to migrate from one structure to another.

最近我正在处理一个小的数据集(约200K),并决定循环现有数据,转换数据模型并插入到新集合中.事实证明,我们的vps并不是那么强大,使用php驱动程序,在确保以下条件后,我只能达到约300次插入/秒:

I was dealing with a small dataset (~200K) recently, and decided to loop existing data, transform data model and insert to new collections. It turned out our vps wasn't that powerful, using php driver I can only get to about ~300 insertion/sec, after having ensured following:

  • 插入前没有索引.
  • 尽可能使用批处理插入.

我想知道我是否只是选择了错误的迁移路径,还是在MongoDB中处理架构更改时是否有一些最佳实践?

I wonder if I have simply picked the wrong migration path, or if there are some best practice when dealing with schema change in MongoDB?

在接受了一些建议之后,我在迁移过程中将写关注点更改为0,这就是我观察到的:

After taking in some suggestions, I changed the write concern to 0 during migration, and this is what i have observed:

  • 插入仍然不如预期的快,最大插入速度为〜500次/秒
  • 插入完成后,建立索引的速度非常快,这可能是由于ensureIndexw=0抛弃了吗?
  • 剩余的更新花了一段时间才开始,这可能是由于索引操作受阻的缘故?然后它似乎以不同的速度运行(以前运行速度一直很慢),也许索引又在发生.
  • CPU和IO都很好. cpu大部分闲置率约为90%,IO等待时间不到10%.
  • Insertions are still not as fast as expected, max at ~500 insertion/sec
  • After insertion completed, indexing go through very quickly, probably due to the fact the ensureIndex is fire-and-forget with w=0?
  • Remaining update took a while to start, probably due to the fact the indexing operations are blocking? Then it appear to ran at a varying speed (previously it was running consistently slower), again maybe indexing were taking place.
  • CPU and IO were fine. cpu mostly had about 90% idle, and IO wait were less than 10%.

除了不使用我们的PHP ORM进行迁移外,还有更多的优化可能吗?

Besides not using our PHP ORM for the migration, are there more possibility for optimization?

推荐答案

向php客户端传输所有内容并从中进行序列化可能会增加很多开销.从外壳运行迁移将是最快的.使用更新来写它们或使用带有 forEach 的游标进行迭代和调用保存.

Transmitting and serializing everything to and from the php client is probably adding a lot of overhead. Running migrations from the shell is going to be the fastest. Write them with an update or use a cursor with forEach to iterate and make calls to save.

查看使用游标的示例 MongoDB更新多个数组记录(朝底部).

See an example of using cursors MongoDB update multiple records of array (towards the bottom).

请注意有关游标的快照问题.如果不对集合进行分片,则可能需要幂等更新或使用快照.

Be aware of snapshot issues with cursors. Probably want an idempotent update or use snapshot if the collection isn't sharded.

这篇关于MongoDB:模式迁移,更新或插入的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆