MongoDB update query performance
I would like to understand which of the following approaches is faster for updates in MongoDB. I want to update a few thousand records in one go.
1. Accumulating the object ids of those records and firing the update using $in, or using a bulk update.
2. Using one or two fields in the collection whose values are common to those few thousand records, akin to a "where" clause in SQL, and firing a single update on those fields. These fields may or may not be indexed.
I know the query will be much smaller in the second case, since the individual "_id" (oid) values are not accumulated. Does accumulating _ids and using those to update documents offer any practical performance advantages?
Does accumulating _ids and using those to update documents offer any practical performance advantages?
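Yes, because MongoDB will certainly use the _id index (an exact match on a single _id takes the IDHACK fast path; an $in on _id is served by a scan of the same index).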
In the second method, as you observed, you can't tell in advance whether an index will be used for a given field.
So the answer is: it depends.
If your collection has millions of documents or more, and/or the number of search fields is quite large, you should prefer the first method, especially if the id list is not small and/or the id values are adjacent.
If your collection is pretty small and you can tolerate a full scan you may prefer the second approach.
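As a rough sketch of the two approaches being compared, the following Python builds the filter and update documents for each one as plain dictionaries, without connecting to a server. The helper names, the batch size of 1000, and the fields `status` and `region` are assumptions made up for illustration; with a driver, each (filter, update) pair would be what you hand to an update-many call.

```python
from itertools import islice


def chunked(ids, size):
    """Yield successive lists of at most `size` ids."""
    it = iter(ids)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def updates_by_id(ids, change, batch_size=1000):
    """Approach 1: one (filter, update) pair per batch of _id values."""
    return [
        ({"_id": {"$in": batch}}, {"$set": change})
        for batch in chunked(ids, batch_size)
    ]


def update_by_fields(criteria, change):
    """Approach 2: a single (filter, update) pair on shared field values."""
    return (dict(criteria), {"$set": change})


# Integers stand in for real ObjectId values (assumption for the sketch).
ids = list(range(2500))
by_id = updates_by_id(ids, {"status": "archived"})
by_fields = update_by_fields(
    {"region": "eu", "status": "active"},  # hypothetical shared fields
    {"status": "archived"},
)

print(len(by_id))      # number of $in batches
print(by_fields[0])    # the field-based filter
```

Batching the id list keeps each individual command small; the field-based filter stays tiny regardless of how many documents it matches, but whether it is fast depends entirely on whether those fields are indexed.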
In any case, you should test both methods using explain().
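When reading explain() output, the main thing to look for is whether the winning plan contains a collection scan (COLLSCAN) or an index scan (IXSCAN). The sketch below walks the `queryPlanner.winningPlan` tree and lists its stages. The two sample documents are simplified and hand-written for illustration, not captured from a real server, and the exact shape of explain output varies between MongoDB versions.

```python
def plan_stages(plan):
    """Collect stage names from a plan tree, outermost stage first."""
    stages = [plan["stage"]]
    children = list(plan.get("inputStages", []))
    if "inputStage" in plan:
        children.insert(0, plan["inputStage"])
    for child in children:
        stages.extend(plan_stages(child))
    return stages


def uses_collection_scan(explain_output):
    """True if the winning plan scans the whole collection."""
    winning = explain_output["queryPlanner"]["winningPlan"]
    return "COLLSCAN" in plan_stages(winning)


# Hand-written sample: update filtered by _id values, served by the _id index.
by_id_explain = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "UPDATE",
            "inputStage": {
                "stage": "FETCH",
                "inputStage": {"stage": "IXSCAN", "indexName": "_id_"},
            },
        }
    }
}

# Hand-written sample: update filtered by an unindexed field.
by_field_explain = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "UPDATE",
            "inputStage": {"stage": "COLLSCAN"},
        }
    }
}

print(plan_stages(by_id_explain["queryPlanner"]["winningPlan"]))
print(uses_collection_scan(by_id_explain))
print(uses_collection_scan(by_field_explain))
```

If the field-based update shows COLLSCAN on a large collection, that is the case where the _id-based approach (or adding an index on those fields) is likely to pay off.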