MongoDB Duplicate Documents even after adding unique key

Question

I created a collection and added a unique key like this:

db.user_services.createIndex({"uid":1 , "sid": 1},{unique:true,dropDups: true})

The collection, "user_services", looks something like this:

{
 "_id" : ObjectId("55068b35f791c7f81000002d"),
 "uid" : 15,
 "sid" : 1,
 "rate" : 5
},
{
 "_id" : ObjectId("55068b35f791c7f81000002f"),
 "uid" : 15,
 "sid" : 1,
 "rate" : 4
}

Problem:

I am using the PHP driver to insert documents with the same uid and sid, and they are being inserted.

What I want:

  1. In the Mongo shell: add a unique key on uid and sid, so that no two documents share the same uid and sid.
  2. On the PHP side: something like MySQL's "insert (value) on duplicate key update rate=rate+1". That is, whenever I try to insert a document, it should be inserted if it is not already there; otherwise the rate field of the existing document should be updated.

Recommended answer

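Congratulations, you appear to have found a bug. In my testing this only happens with MongoDB 3.0.0, or at least it is not present in MongoDB 2.6.6. The bug is now recorded as SERVER-17599.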

NOTE: This is not actually an "issue"; it has been confirmed as "by design". The dropDups option was dropped in version 3.0.0, although it is still listed in the documentation.

The problem is that index creation fails with an error when the collection already contains duplicates on the "compound key" fields, so the index is never created. For the collection above, the index creation should yield this in the shell:

{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "errmsg" : "exception: E11000 duplicate key error dup key: { : 15.0, : 1.0 }",
    "code" : 11000,
    "ok" : 0
}

When no duplicates are present, you can create the index as you are currently trying and it will be created.

So to work around this, first remove the duplicates with a procedure like this:

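// Group documents by the compound key and collect the _id values of each group
db.user_services.aggregate([
    { "$group": {
        "_id": { "uid": "$uid", "sid": "$sid" },
        "dups": { "$push": "$_id" },
        "count": { "$sum": 1 }
    }},
    // Keep only the groups that contain more than one document
    { "$match": { "count": { "$gt": 1 } }}
]).forEach(function(doc) {
    // Keep the first document of each group and remove the rest
    doc.dups.shift();
    db.user_services.remove({ "_id": { "$in": doc.dups } });
});

// With the duplicates gone, the unique index can be created
db.user_services.createIndex({ "uid": 1, "sid": 1 }, { unique: true })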

After that, further inserts containing duplicate data will be rejected and the appropriate error will be reported.

The final note here is that "dropDups" is/was not a very elegant solution for removing duplicate data. You really want something with more control as demonstrated above.
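
To make the "keep one, remove the rest" step concrete, here is a minimal sketch of the same grouping logic applied to a plain JavaScript array instead of a live collection. The function name duplicateIdsToRemove and the sample data are made up for illustration; on a real collection the aggregation pipeline does this work on the server.

```javascript
// Hypothetical helper: mirrors the $group / $match / shift() steps in memory.
function duplicateIdsToRemove(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // Same grouping key as { uid: "$uid", sid: "$sid" }
    const key = JSON.stringify([doc.uid, doc.sid]);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(doc._id);
  }
  const toRemove = [];
  for (const ids of groups.values()) {
    if (ids.length > 1) {
      // Keep the first _id of the group, mark the rest for removal.
      toRemove.push(...ids.slice(1));
    }
  }
  return toRemove;
}

// Illustrative data only, shaped like the documents in the question.
const sample = [
  { _id: "a", uid: 15, sid: 1, rate: 5 },
  { _id: "b", uid: 15, sid: 1, rate: 4 },
  { _id: "c", uid: 15, sid: 2, rate: 1 },
];
console.log(duplicateIdsToRemove(sample)); // [ 'b' ]
```

Which document survives depends on the order in which the documents are grouped, so if it matters which one is kept (for example the highest rate), sort before grouping.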

For the second part, use the .update() method rather than .insert(). It has an "upsert" option:

$collection->update(
    array( "uid" => 1, "sid" => 1 ),
    array( '$set' => $someData ),
    array( 'upsert' => true )
);

So the "found" documents are "modified" and the documents not found are "inserted". Also see $setOnInsert for a way to only create certain data when the document is actually inserted and not when modified.
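
For the "on duplicate key update rate=rate+1" behaviour asked about, one option is to combine the upsert with $inc. The sketch below is in JavaScript rather than PHP and assumes a collection object that exposes updateOne(filter, update, options), as recent versions of the shell and the Node.js driver do; the function name insertOrBumpRate is hypothetical.

```javascript
// Hypothetical helper: insert { uid, sid, rate: 1 } if no document matches,
// otherwise add 1 to the existing rate.
function insertOrBumpRate(collection, uid, sid) {
  return collection.updateOne(
    { uid: uid, sid: sid },   // match on the unique compound key
    { $inc: { rate: 1 } },    // $inc creates the field when it is absent
    { upsert: true }          // insert when nothing matches
  );
}
```

In the shell this could be called as insertOrBumpRate(db.user_services, 15, 1). On insert, the equality fields from the query are copied into the new document and $inc sets rate to the increment amount, so a new document starts with a rate of 1.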

For your specific attempt, the correct syntax of .update() takes three arguments: "query", "update" and "options":

$collection->update(
    array( "uid" => 1, "sid" => 1 ),
    array(
        '$set' => array( "field" => "this" ),
        '$inc' => array( "counter" => 1 ),
        '$setOnInsert' => array( "newField" => "another" )
    ),
    array( "upsert" => true )
);

No update operation is allowed to "access the same path" as another update operation in the same "update" document.
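
As an illustration of that rule, the following sketch checks an update document for paths that collide with each other. It is a simplified client-side approximation with a hypothetical function name (conflictingPaths); the server performs its own validation, and its exact rules and error messages may differ.

```javascript
// Hypothetical checker: returns paths that collide with an earlier path
// somewhere in the update document.
function conflictingPaths(update) {
  const seen = [];
  const conflicts = [];
  for (const op of Object.keys(update)) {
    for (const path of Object.keys(update[op])) {
      for (const other of seen) {
        // Same path, or one path is a dotted prefix of the other.
        if (
          path === other ||
          path.startsWith(other + ".") ||
          other.startsWith(path + ".")
        ) {
          conflicts.push(path);
        }
      }
      seen.push(path);
    }
  }
  return conflicts;
}

// Fine: each operator touches a different field.
console.log(conflictingPaths({
  $set: { field: "this" },
  $inc: { counter: 1 },
  $setOnInsert: { newField: "another" },
})); // []

// Not allowed: "rate" appears under both operators.
console.log(conflictingPaths({
  $setOnInsert: { rate: 1 },
  $inc: { rate: 1 },
})); // [ 'rate' ]
```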
