Duplicate documents on _id (in mongo)
Problem description
I have a sharded mongo collection with over 1.5 million documents. I use the _id field as the shard key, and its values are integers (rather than ObjectIds).
I do a lot of write operations on this collection, using the Perl driver (insert, update, remove, save) and mongoimport.
My problem is that somehow, I have duplicate documents on the same _id. From what I've read, this shouldn't be possible.
I've removed the duplicates, but others still appear.
Do you have any idea where they could come from, or what I should start looking at? (Also, I've tried to reproduce this on a smaller test collection, but no duplicates are inserted, no matter what write operation I perform.)
This actually isn't a problem with the Perl driver; it is related to the characteristics of sharding. MongoDB can only enforce uniqueness among the documents located on a single shard at the time of creation, so the default index does not require uniqueness.
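To make the per-shard point concrete, here is a toy sketch in Python. It is not MongoDB code: the `Shard` class and `find_cluster_duplicates` helper are hypothetical names invented for illustration, and the sketch does not model how a document ends up on a second shard. It only shows that an index that is unique within each shard can still leave the same _id present twice across the cluster.

```python
class Shard:
    """Toy stand-in for one shard with a local unique index on _id."""

    def __init__(self, name):
        self.name = name
        self.docs = {}  # _id -> document, unique only within this shard

    def insert(self, doc):
        # The uniqueness check sees only this shard's own documents.
        if doc["_id"] in self.docs:
            raise KeyError(f"duplicate _id {doc['_id']} on {self.name}")
        self.docs[doc["_id"]] = doc


def find_cluster_duplicates(shards):
    """Return {_id: [shard names]} for every _id stored on more than one shard."""
    seen = {}
    for shard in shards:
        for _id in shard.docs:
            seen.setdefault(_id, []).append(shard.name)
    return {k: v for k, v in seen.items() if len(v) > 1}


shard_a, shard_b = Shard("shard_a"), Shard("shard_b")
shard_a.insert({"_id": 42, "src": "perl"})
# Accepted: shard_b has never seen _id 42, so its local check passes.
shard_b.insert({"_id": 42, "src": "mongoimport"})

# A second insert of the same _id on the same shard is rejected locally.
try:
    shard_a.insert({"_id": 42, "src": "retry"})
    local_rejected = False
except KeyError:
    local_rejected = True

duplicates = find_cluster_duplicates([shard_a, shard_b])
print(duplicates)
```

This would also explain why a small test collection might not reproduce the problem: if all of its documents live on one shard, the local check catches every duplicate.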
The MongoDB "Configuring Sharding" documentation specifically mentions that:
When you shard a collection, you must specify the shard key. If there is data in the collection, mongo will require an index to be created upfront (it speeds up the chunking process); otherwise, an index will be automatically created for you.
You can use the {unique: true} option to ensure that the underlying index enforces uniqueness so long as the unique index is a prefix of the shard key.
If the "unique: true" option is not used, the shard key does not have to be unique.
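As a minimal sketch of what the quoted passage describes, the snippet below builds a `shardCollection` admin command that asks for a unique shard-key index. The helper name `build_shard_command`, the namespace `mydb.items`, and the connection address in the comments are assumptions made for illustration; only the command fields (`shardCollection`, `key`, `unique`) come from MongoDB itself.

```python
def build_shard_command(namespace, key_field="_id", unique=True):
    """Build the admin command document for sharding a collection.

    namespace is "<database>.<collection>"; the command name must be
    the first key, which a regular dict preserves in Python 3.7+.
    """
    return {
        "shardCollection": namespace,
        "key": {key_field: 1},
        "unique": unique,  # ask for a unique index on the shard key
    }


command = build_shard_command("mydb.items")  # hypothetical namespace
print(command)

# With pymongo and a connection to a mongos router, it could be sent as:
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")  # hypothetical address
#   client.admin.command(command)
```

On a collection that already holds data, the quoted documentation implies the index has to exist before the collection is sharded, so any existing duplicates would need to be removed first.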