mongo 3 duplicates on unique index - dropDups


Problem description




The MongoDB documentation says: "Changed in version 3.0: The dropDups option is no longer available."

Is there anything I can do (other than downgrading) if I actually want to create a unique index and destroy duplicate entries?

Please keep in mind that I receive about 300 inserts per second, so I can't just delete all duplicates and hope that none come in before the indexing is done.

Solution

Yes, dropDups has been deprecated since version 2.7.5, because it was not possible to correctly predict which document would be deleted in the process.

Typically, you have two options:

  1. Use a new collection:

    • Create a new collection.
    • Create the unique index on this new collection.
    • Run a batch to copy all the documents from the old collection to the new one, making sure you ignore duplicate key errors during the process.
  2. Deal with it manually in your own collection:

    • Make sure your code won't insert any more duplicated documents.
    • Run a batch on your collection to delete the duplicates (and make sure you keep the good one if they are not completely identical).
    • Then add the unique index.
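
The cleanup step of option 2 could be sketched roughly as follows. This is only an illustration under stated assumptions: documents are plain dicts with an `_id`, `email` is the field that should become unique, and "the good one" is the document with the newest `updated_at`. The field names and the keep rule are hypothetical; the answer only says to keep the good one. The function just computes which `_id` values to delete, and issuing the deletes against MongoDB is left out.

```python
from collections import defaultdict


def plan_duplicate_removal(docs, key_field, prefer):
    """Return the _id values to delete so that key_field becomes unique.

    docs      -- iterable of dict-like documents, each with an "_id"
    key_field -- field the unique index will cover (hypothetical name)
    prefer    -- function(doc) -> sortable value; in each group of
                 duplicates the document with the highest value is kept
    """
    groups = defaultdict(list)
    for doc in docs:
        # Documents missing the field are grouped under None, since a plain
        # unique index also treats a missing field as a single null value.
        groups[doc.get(key_field)].append(doc)

    to_delete = []
    for same_key in groups.values():
        if len(same_key) < 2:
            continue  # already unique, nothing to do
        keeper = max(same_key, key=prefer)
        to_delete.extend(d["_id"] for d in same_key if d["_id"] != keeper["_id"])
    return to_delete


# Example data (made up for illustration).
docs = [
    {"_id": 1, "email": "a@example.com", "updated_at": 10},
    {"_id": 2, "email": "a@example.com", "updated_at": 20},
    {"_id": 3, "email": "b@example.com", "updated_at": 5},
]
ids_to_delete = plan_duplicate_removal(
    docs, "email", prefer=lambda d: d["updated_at"]
)
```

Separating the "which ones go" decision from the actual deletes makes it possible to review the list first, which matters when the duplicates are not completely identical.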

For your particular case, I would recommend the first option, but with a trick:

  • Create a new collection with the unique index.
  • Update your code so that it now inserts documents into both collections.
  • Run a batch to copy all documents from the old collection to the new one (ignoring duplicate key errors).
  • Rename the new collection to match the old name.
  • Update your code again so that it writes only to the collection under the "old" name.
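
The backfill step of this procedure might look like the sketch below. It deliberately assumes nothing about the driver: `insert_one` and `duplicate_error` are parameters (hypothetical names), so with PyMongo one would presumably pass the new collection's `insert_one` method and `pymongo.errors.DuplicateKeyError`. The in-memory `FakeUniqueCollection` and `DuplicateKey` classes exist only so the example runs without a server.

```python
def copy_ignoring_duplicates(source_docs, insert_one, duplicate_error):
    """Copy documents one by one, skipping those the unique index rejects.

    Returns a (copied, skipped) tuple of counts.
    """
    copied = skipped = 0
    for doc in source_docs:
        try:
            insert_one(doc)
            copied += 1
        except duplicate_error:
            # An earlier document or a live write already owns this key.
            skipped += 1
    return copied, skipped


class DuplicateKey(Exception):
    """Stand-in for the driver's duplicate key error (illustration only)."""


class FakeUniqueCollection:
    """In-memory stand-in for a collection with a unique index on one field."""

    def __init__(self, unique_field):
        self.unique_field = unique_field
        self.docs = {}

    def insert_one(self, doc):
        key = doc[self.unique_field]
        if key in self.docs:
            raise DuplicateKey(key)
        self.docs[key] = doc


new_collection = FakeUniqueCollection("email")
# A live write that arrived through the dual-write path during the backfill.
new_collection.insert_one({"_id": 9, "email": "b@example.com"})

old_docs = [
    {"_id": 1, "email": "a@example.com"},
    {"_id": 2, "email": "a@example.com"},
    {"_id": 3, "email": "b@example.com"},
]
copied, skipped = copy_ignoring_duplicates(
    old_docs, new_collection.insert_one, DuplicateKey
)
```

Whichever document reaches the new collection first wins, so if the duplicates are not identical, the copy order decides which one survives. Also note that MongoDB's renameCollection does not overwrite an existing target by default, so the old collection has to be dropped or renamed out of the way (or the dropTarget option used) before the new one can take its name.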
