Why doesn't MongoDB use index intersection?

Question

I created a database with a single collection that stores documents with only 2 fields (and an id):

public class Hamster
{
    public ObjectId Id;
    public string Name;
    public int Age;
}

I also created an index for each field.

When I execute a query filtering on both fields I expect it to combine both indexes using Index Intersection to reduce the collection scanning and improve performance. This is never the case. I haven't yet managed to induce an index intersection.

So, what stops MongoDB from applying index intersection?

Recommended answer

When you use explain(true) you can see that the optimizer considers using index intersection and chooses not to:

"cursor" : "BtreeCursor Age", // Chosen plan.
...
"allPlans" : [
   {
       "cursor" : "BtreeCursor Age",
       ...
   },
   {
       "cursor" : "BtreeCursor Name",
       ...
   },
   {
       "cursor" : "Complex Plan", // Index intersection.
       ...
   }
]

MongoDB will never choose intersection if there's a sufficient compound index. Other limitations can be found on the Jira ticket for Index Intersection:


The query optimizer may select index intersection plans when the following conditions hold:
1. Most of the documents in the relevant collection are disk-resident. The advantage of index intersection is that it can avoid fetching complete documents when the size of the intersection is small. If the documents are already in memory, there is nothing to gain by avoiding fetches.
2. The query predicates are single point intervals, rather than range predicates or a set of intervals. Queries over single point intervals return documents sorted by disk location, which allows the optimizer to select plans that compute the intersection in a non-blocking fashion. This is generally faster than the alternative mode of computing the intersection, which is to build a hash table with the results from one index, and then probe it with the results from the second index.
3. Neither of the indices to be intersected are highly selective. If one of the indices is selective then the optimizer will choose a plan which simply scans this selective index.
4. The size of the intersection is small relative to the number of index keys scanned by either single-index solution. In this case the query executor can look at a smaller set of documents using index intersection, potentially allowing us to reap the benefits of fewer fetches from disk.
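The two ways of computing an intersection mentioned in condition 2 can be illustrated with a small sketch. This is not MongoDB's implementation; the function names and the sample record ids are hypothetical, and plain arrays of numbers stand in for the record locations produced by two index scans. It only shows why sorted inputs allow a streaming merge, while unsorted inputs require building a hash table from one side first.

```javascript
// Hypothetical stand-ins for the record ids returned by two index scans.
// Assumption: with single point predicates both lists arrive sorted by
// record location, as the quoted ticket describes.

// Non-blocking style: walk both sorted lists in lockstep and emit matches
// as soon as they are found. Requires both inputs to be sorted ascending.
function intersectSorted(a, b) {
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push(a[i]);
      i++;
      j++;
    } else if (a[i] < b[j]) {
      i++;
    } else {
      j++;
    }
  }
  return out;
}

// Blocking style: consume one input completely into a hash set, then
// probe it with each id from the other input. Works for unsorted inputs.
function intersectHashed(a, b) {
  const seen = new Set(a);
  return b.filter((id) => seen.has(id));
}

// Made-up sample data for illustration only.
const idsMatchingAge = [3, 7, 12, 20, 41];
const idsMatchingName = [7, 9, 20, 33];

console.log(intersectSorted(idsMatchingAge, idsMatchingName));
console.log(intersectHashed(idsMatchingAge, [33, 20, 9, 7]));
```

In the sorted version no result has to wait for a full scan of either input, which is the sense in which the quoted text calls it non-blocking. The hashed version must finish reading the first input before it can emit anything.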

MongoDB places many restrictions on index intersection, which make it unlikely to be chosen in practice.
