What is the correct way to index in MongoDB when many combinations of fields exist?


Problem description

Consider a search panel that includes multiple filter options.

I'm working with MongoDB and have created a compound index on 3-4 properties in a specific order. But when I run different combinations of searches, the execution plan (explain()) shows a different order every time. Sometimes I see a collection scan (bad), and sometimes the query fits the index (IXSCAN).

The selective fields that the MongoDB indexes should handle are: (brand, Types, Status, Warehouse, Carries, Search - only by id)

My question is:

Do I have to create every combination of all the fields in different orders? That could be 10-20 compound indexes. Or should I create 1-3 big compound indexes? But again, that would not solve the ordering problem.

What is the best strategy for dealing with a large variety of field combinations?

I use queries with the same structure but with different combinations of fields:

// Example query.
// The fields (and their order) can differ every time, depending on what the user selects!

 db.getCollection("orders").find({
  '$and': [
    {
      'status': {
        '$in': [
          'XXX',
          'YYY'
        ]
      }
    },
    {
      'searchId': {
        '$in': [
          '3859447'
        ]
      }
    },
    {
      'origin.brand': {
        '$in': [
          'aaaa',
          'bbbb',
          'cccc',
          'ddd',
          'eee',
          'bundle'
        ]
      }
    },
    {
      '$or': [
        {
          'origin.carries': 'YYY'
        },
        {
          'origin.carries': 'ZZZ'
        },
        {
          'origin.carries': 'WWWW'
        }
      ]
    }
  ]
}).sort({"timestamp":1})

// My compound index is (dotted field names must be quoted):
{status: 1, searchId: -1, "origin.brand": 1, "origin.carries": 1, timestamp: 1}

But that is only one combination. There could be plenty, like:

a. {status: 1}
b. {status: 1, searchId: -1}
c. {status: 1, searchId: -1, "origin.brand": 1}
d. {status: 1, searchId: -1, "origin.brand": 1, "origin.carries": 1}
........

Additionally, what will happen to write/read performance? I think write performance will degrade relative to reads...

The query patterns are:

1. find(...) with '$and'/'$or' + sort

2. Aggregation with Match/sort

Thanks

Recommended answer

So you have subdocuments, range queries, and sorting by one field only.

That eliminates most of the possible permutations, assuming there are no other surprises.
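
One reason the number of required indexes shrinks: a compound index can also serve queries that filter on a leading prefix of its fields, so separate indexes on {status: 1} and {status: 1, searchId: -1} add little once the longer index exists. Below is a minimal sketch of that check in plain JavaScript. The helper name `usablePrefixLength` is hypothetical, and it ignores sort order and selectivity, which the real query planner also weighs.

```javascript
// Hypothetical helper, not a MongoDB API: counts how many leading fields of a
// compound index are constrained by a query. A result of 0 means the query
// skips the leading field, so it cannot use the index as a prefix match.
function usablePrefixLength(indexFields, queryFields) {
  const wanted = new Set(queryFields);
  let n = 0;
  while (n < indexFields.length && wanted.has(indexFields[n])) {
    n += 1;
  }
  return n;
}

// Field order taken from the compound index in the question.
const indexFields = ["status", "searchId", "origin.brand", "origin.carries"];

const prefixStatusOnly = usablePrefixLength(indexFields, ["status"]);
const prefixStatusAndId = usablePrefixLength(indexFields, ["searchId", "status"]);
const prefixIdOnly = usablePrefixLength(indexFields, ["searchId"]);

console.log(prefixStatusOnly, prefixStatusAndId, prefixIdOnly);
```

The order in which fields appear in the query document does not matter here; only the order of the fields in the index does.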

D. SM already covered selectivity - you should really listen to what the man says and at least upvote.

The other thing to consider is the order of the fields in the compound index:

  1. Fields that have a direct match, such as $eq
  2. Fields you sort on
  3. Fields with range queries: $in, $lt, $or, etc.
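
The ordering rule above can be sketched as a small plain-JavaScript helper that sorts fields into equality, sort, and range groups before building the index key document. The helper `buildIndexKeys` and the `role` labels are hypothetical names introduced for illustration. Classifying `searchId` as an equality field is an assumption based on the question's query, where its `$in` holds a single value.

```javascript
// Hypothetical helper: orders fields as equality first, then sort, then range.
const ROLE_RANK = { equality: 0, sort: 1, range: 2 };

function buildIndexKeys(fields) {
  for (const f of fields) {
    if (!(f.role in ROLE_RANK)) {
      throw new Error("unknown role for field " + f.name + ": " + f.role);
    }
  }
  // Array.prototype.sort is stable, so fields with the same role keep
  // their relative input order.
  const ordered = [...fields].sort(
    (a, b) => ROLE_RANK[a.role] - ROLE_RANK[b.role]
  );
  const keys = {};
  for (const f of ordered) {
    keys[f.name] = f.direction || 1; // default to ascending
  }
  return keys;
}

// Fields from the question's query; the role assignments are assumptions.
const esrKeys = buildIndexKeys([
  { name: "status", role: "range" },          // $in with several values
  { name: "searchId", role: "equality" },     // $in with a single value
  { name: "origin.carries", role: "range" },  // $or over several values
  { name: "timestamp", role: "sort", direction: 1 },
]);

console.log(JSON.stringify(esrKeys));
```

The resulting key document could then be passed to `db.collection.createIndex()` in the shell. Which range field comes first within its group still depends on the data.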

These are common rules for all B-trees. Now for the things that are specific to MongoDB:

A compound index can contain no more than one multikey (array-valued) field, such as a field inside an array of subdocuments like "origin.brand". Again, I assume the origins are embedded documents, so the document's shape is like this:

{
    _id: ...,
    status: ...,
    timestamp: ....,
    origin: [
        {brand: ..., carries: ...},
        {brand: ..., carries: ...},
        {brand: ..., carries: ...}
    ]
}

For your query the best index would be

{
  searchId: 1,
  timestamp: 1,
  status: 1, /** only if it is selective enough **/
  "origin.carries" : 1 /** or brand, depending on data **/
}
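
One way to check whether a query really uses an index like this is to look at the stages in the explain() output. The sketch below is plain JavaScript that walks a winning plan and lists its stages. It assumes the classic plan shape, where each node has a `stage` and optionally `inputStage` or `inputStages`; newer server versions may nest the plan differently. The sample plan is hand-written for illustration, not real server output.

```javascript
// Collects stage names from an explain() winning plan, assuming each node has
// `stage` plus optional `inputStage` / `inputStages` children.
function collectStages(plan, out = []) {
  if (!plan) return out;
  out.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, out);
  for (const child of plan.inputStages || []) collectStages(child, out);
  return out;
}

// Hand-written sample; in the shell this object would come from something like
// db.orders.find({...}).sort({timestamp: 1}).explain("queryPlanner").
const sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", keyPattern: { searchId: 1, timestamp: 1 } },
    },
  },
};

const planStages = collectStages(sampleExplain.queryPlanner.winningPlan);
const usesCollScan = planStages.includes("COLLSCAN");  // full collection scan
const hasInMemorySort = planStages.includes("SORT");   // sort not served by the index

console.log(planStages, usesCollScan, hasInMemorySort);
```

A plan containing COLLSCAN, or a SORT stage on top of the index scan, suggests the index order does not fit that query shape.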

Regarding the number of indexes: it depends on the data size. Ensure all indexes fit into RAM, otherwise it will be really slow.
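
A rough way to reason about this is to compare the total index size with the memory available for caching. The sketch below is plain JavaScript over an object shaped like the `totalIndexSize` and `indexSizes` fields of `db.collection.stats()`. The helper name `indexMemoryReport`, the sample sizes, and the 1 GiB cache figure are all made up for illustration. The comparison is only a heuristic, since the cache is shared with the documents themselves.

```javascript
// Hypothetical helper: compares total index size against a cache budget and
// lists indexes from largest to smallest.
function indexMemoryReport(collStats, cacheBytes) {
  const total = collStats.totalIndexSize;
  const largestFirst = Object.entries(collStats.indexSizes)
    .sort((a, b) => b[1] - a[1]);
  return { total, fitsInCache: total <= cacheBytes, largestFirst };
}

const MiB = 1024 * 1024;

// Made-up numbers, shaped like the relevant part of db.orders.stats().
const sampleStats = {
  totalIndexSize: 160 * MiB,
  indexSizes: {
    "_id_": 40 * MiB,
    "status_1_searchId_-1": 120 * MiB,
  },
};

// Assumed cache budget of 1 GiB.
const memoryReport = indexMemoryReport(sampleStats, 1024 * MiB);

console.log(memoryReport.fitsInCache, memoryReport.largestFirst[0][0]);
```

Sorting by size makes it easy to see which index to reconsider first when the total does not fit.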

Last but not least - indexing is not a one-off job but a lifestyle. Data changes over time, and so do queries. If you care about performance and have finite resources, you should keep an eye on the database. Check slow queries to decide which indexes to add, and collect statistics from users' queries to find unused indexes you can remove to free up some room. Basically, apply common sense.
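
For the "remove unused indexes" part, MongoDB's `$indexStats` aggregation stage reports per-index usage counters. The sketch below is plain JavaScript that filters such a result for indexes with few recorded operations. The helper name `findUnusedIndexes` and the sample documents are hypothetical. The samples use plain numbers for `accesses.ops`, whereas the shell may return a 64-bit integer type that needs converting first.

```javascript
// Hypothetical helper: returns the names of indexes whose recorded usage is
// below `minOps`. The default _id index is skipped because it cannot be dropped.
function findUnusedIndexes(indexStats, minOps = 1) {
  return indexStats
    .filter((s) => s.name !== "_id_" && Number(s.accesses.ops) < minOps)
    .map((s) => s.name);
}

// Made-up sample, shaped like the output of
// db.orders.aggregate([{ $indexStats: {} }]).
const sampleIndexStats = [
  { name: "_id_", accesses: { ops: 0 } },
  { name: "status_1_searchId_-1", accesses: { ops: 5230 } },
  { name: "origin.brand_1", accesses: { ops: 0 } },
];

const unusedIndexes = findUnusedIndexes(sampleIndexStats);

console.log(unusedIndexes);
```

The counters reset when the server restarts, so a zero count is only meaningful after the server has seen a representative period of traffic.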
