MongoDB: update a limited number of documents

Question

I have a collection with 100 million documents. I want to safely update a number of the documents (by safely I mean update a document only if it hasn't already been updated). Is there an efficient way to do it in Mongo?

I was planning to use the $isolated operator with a limit clause but it appears mongo doesn't support limiting on updates.

This seems simple but I'm stuck. Any help would be appreciated.

Recommended answer

Per Sammaye, it doesn't look like there is a "proper" way to do this. My workaround was to create a sequence as outlined on the mongo site and simply add a 'seq' field to every record in my collection. Now I have a unique field which is reliably sortable to update on.
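
The sequence step could be sketched roughly as follows. This is a minimal sketch, not the original author's code: it assumes version 6 or later of the Node.js `mongodb` driver (where `findOneAndUpdate` resolves to the document itself; older versions wrap it in a result object), and the `counters` collection and the helper names `nextSeq` and `assignSeq` are hypothetical.

```javascript
// Hypothetical helper: atomically increment and return the next value of a
// named sequence kept in a `counters` collection as { _id: name, seq: n }.
async function nextSeq(counters, name) {
  const doc = await counters.findOneAndUpdate(
    { _id: name },
    { $inc: { seq: 1 } },
    { upsert: true, returnDocument: 'after' } // create on first use, return the new value
  );
  return doc.seq;
}

// Hypothetical backfill: give every document that lacks a `seq` field the
// next value of the sequence, one document at a time.
async function assignSeq(coll, counters, name) {
  const cursor = coll.find({ seq: { $exists: false } }, { projection: { _id: 1 } });
  for await (const doc of cursor) {
    const seq = await nextSeq(counters, name);
    await coll.updateOne({ _id: doc._id }, { $set: { seq } });
  }
}
```

The backfill makes two round trips per document, which would be slow on a collection of this size; batching the writes would be the obvious refinement.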

Reliably sortable is important here. I was going to just sort on the auto-generated _id, but I quickly realized that natural order is NOT the same as ascending order for ObjectIds (from this page it looks like the string value takes precedence over the object value, which matches the behavior I observed in testing). Also, it is entirely possible for a record to be relocated on disk, which makes the natural order unreliable for sorting.

So now I can query for the record with the smallest 'seq' which has NOT already been updated to get an inclusive starting point. Next I query for records with 'seq' greater than my starting point and skip the number of records I want to update (it is important to skip, since 'seq' may be sparse if you remove documents, etc.). Put a limit of 1 on that query and you've got a non-inclusive endpoint. Now I can issue an update with a query of 'updated' = 0, 'seq' >= my starting point and < my endpoint. Assuming no other thread has beaten me to the punch, the update should give me what I want.

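Here are the steps again:

  1. Create an auto-increment sequence using findAndModify.
  2. Add a field to your collection which uses the auto-increment sequence.
  3. Query to find a suitable starting point: db.xx.find({ updated: 0 }).sort({ seq: 1 }).limit(1)
  4. Query to find a suitable endpoint: db.xx.find({ seq: { $gt: startSeq } }).sort({ seq: 1 }).skip(updateCount).limit(1)
  5. Update the collection using the starting and ending points: db.xx.update({ updated: 0, seq: { $gte: startSeq, $lt: endSeq }, $isolated: 1 }, { $set: { updated: 1 } }, { multi: true })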
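
Steps 3 to 5 could be put together roughly like this. It is a sketch under assumptions, not the original author's code: it targets the Node.js `mongodb` driver, and `buildBatchUpdate` and `updateBatch` are hypothetical names. It deviates from the steps as written in three places. Both bounds on `seq` go into a single sub-document, since repeating the `seq` key in one object literal keeps only the last one. The endpoint query uses `$gte` rather than `$gt`, so that the range spans `updateCount` documents instead of `updateCount + 1`. And `$isolated` is left out, because the operator was removed in MongoDB 4.0.

```javascript
// Build the filter and update documents for one batch.
// endSeq === null means fewer than a full batch remains, so no upper bound.
function buildBatchUpdate(startSeq, endSeq) {
  const seqRange = { $gte: startSeq };
  if (endSeq !== null) seqRange.$lt = endSeq;
  return {
    filter: { updated: 0, seq: seqRange },  // only documents not yet updated
    update: { $set: { updated: 1 } },       // $set keeps the other fields intact
  };
}

// Mark roughly `updateCount` not-yet-updated documents as updated.
// Returns the number of documents actually modified.
async function updateBatch(coll, updateCount) {
  // Step 3: inclusive starting point = smallest seq not yet updated.
  const first = await coll.find({ updated: 0 }).sort({ seq: 1 }).limit(1).next();
  if (first === null) return 0; // nothing left to update

  // Step 4: exclusive endpoint = the document `updateCount` places further on.
  const end = await coll
    .find({ seq: { $gte: first.seq } })
    .sort({ seq: 1 })
    .skip(updateCount)
    .limit(1)
    .next();

  // Step 5: update everything in [start, end) that is still not updated.
  const { filter, update } = buildBatchUpdate(first.seq, end ? end.seq : null);
  const result = await coll.updateMany(filter, update);
  return result.modifiedCount;
}
```

Because the endpoint query does not filter on `updated`, the range can contain documents that were already updated, so a call may modify fewer than `updateCount` documents. The `updated: 0` condition in the final filter is what prevents a document from being updated twice when another thread gets there first.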

Pretty painful but it gets the job done.
