MongoDb: how to create the right (composite) index for data with many searchable fields

Question

UPDATE: I need to add that the point of this question is to allow me to define schemas for Json Rest Stores. The user can search by any one key, or by several keys, so I cannot easily predict what users will search by: it could be 1, 2, or 5 fields (this is especially true for data-rich records like people, bookings, etc.).

Imagine that I have an index such as this:

{ "item": 1, "location": 1, "stock": 1 }

The MongoDB manual on indexes says:

MongoDB can use this index to support queries that include:


  • the item field,
  • the item field and the location field,
  • the item field and the location field and the stock field, or
  • only the item and stock fields; however, this index would be less efficient than an index on only item and stock.

MongoDB cannot use this index to support queries that include:

  • only the location field,
  • only the stock field, or
  • only the location and stock fields.
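The prefix rule quoted from the manual can be sketched as a small function. This is only an illustration of the rule as stated above, not how the query planner is actually implemented; the function name `usablePrefixLength` is made up for this example, and it assumes plain equality filters with no sorts or ranges.

```javascript
// Sketch: how many leading keys of a compound index a query can use.
// `indexKeys` is the ordered key list of the index; `queryFields` lists the
// fields the query filters on (equality filters assumed for simplicity).
function usablePrefixLength(indexKeys, queryFields) {
  const wanted = new Set(queryFields);
  let count = 0;
  for (const key of indexKeys) {
    if (!wanted.has(key)) break; // the usable prefix stops at the first gap
    count += 1;
  }
  return count;
}

const indexKeys = ["item", "location", "stock"];

console.log(usablePrefixLength(indexKeys, ["item"]));                      // 1
console.log(usablePrefixLength(indexKeys, ["item", "location"]));          // 2
console.log(usablePrefixLength(indexKeys, ["item", "location", "stock"])); // 3
console.log(usablePrefixLength(indexKeys, ["item", "stock"]));             // 1 (only "item" narrows the scan)
console.log(usablePrefixLength(indexKeys, ["location", "stock"]));         // 0 (index not usable)
```

A result of 0 corresponds to the "cannot use this index" cases, and the item-plus-stock case returning 1 is why the manual calls it less efficient than a dedicated index on item and stock.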

Now, suppose I have a schema with exactly these fields:


  • item: String
  • location: String
  • stock: String
  • qty: number

And imagine I want to make sure every query is indeed indexed. I would do:

For item:

  • item, location, stock, qty
  • item, location, qty, stock
  • item, stock, qty, location
  • item, stock, location, qty
  • item, qty, location, stock
  • item, qty, stock, location

For location:

  • ... you get the gist

Now... this seems a little insane. If you have a database where you have TEN searchable fields, this becomes clearly unworkable as the number of indexes grows exponentially.
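To put a number on it: if every ordering of all n searchable fields got its own index, as in the lists above, the count would be n! (factorial growth, which is even faster than exponential). A quick sketch, with `countOrderings` being a name invented for this example:

```javascript
// Number of distinct orderings (permutations) of n fields: n!
function countOrderings(n) {
  let total = 1;
  for (let k = 2; k <= n; k += 1) {
    total *= k;
  }
  return total;
}

console.log(countOrderings(4));  // 24 (6 orderings for each of the 4 leading fields)
console.log(countOrderings(10)); // 3628800
```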

Am I missing something? My idea was to define a schema, define which fields were searchable, and write a function that makes up all of the needed indexes regardless of what fields were present and what fields weren't. However, I am thinking about it, and... well, I must be missing something.

Am I?

Recommended answer

What are your real query patterns? It's very unlikely that you would need to create all of these possible index combinations. I also doubt that including qty in the index would be of much use. Do you need to search for things where qty == 4 independent of location and item type?

An index doesn't need to identify every single record; it just needs to be specific enough to make any final scan small. Given an item code or a stock value, are there really so many locations that you'd also need to index on them?

I suspect in this case an index on item, an index on location and an index on stock would be sufficient to answer the most likely queries with sufficient speed. (But we'd need to know more about what these field names mean and what the count and distribution of values is within them.)
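As a rough sketch of that suggestion, the schema-driven function the question asks about could emit one single-field index spec per searchable field instead of every ordering. The helper name `singleFieldIndexSpecs` and the collection name `items` are assumptions made for this example only.

```javascript
// Build one ascending single-field index spec per searchable field.
function singleFieldIndexSpecs(searchableFields) {
  return searchableFields.map((field) => ({ [field]: 1 }));
}

const specs = singleFieldIndexSpecs(["item", "location", "stock"]);
console.log(JSON.stringify(specs));
// [{"item":1},{"location":1},{"stock":1}]

// In the mongo shell each spec could then be passed to createIndex, e.g.:
//   specs.forEach((spec) => db.items.createIndex(spec));
// (`items` is a hypothetical collection name.)
```

The number of indexes then grows linearly with the number of searchable fields, and compound indexes can still be added later for query patterns that turn out to need them.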

Use explain with your queries and you can see how well they are performing. Add indices as necessary; don't create every possible ordering.
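One way to act on explain output is to check whether the winning plan falls back to a full collection scan (stage `COLLSCAN`) instead of an index scan (`IXSCAN`). The sketch below walks a plan tree and collects its stage names. The two plan objects are hand-written for illustration and are not real server output, the helper names are invented here, and the exact layout of explain results varies between server versions, so treat this as an assumption to verify against your own output.

```javascript
// Collect the stage names of a plan tree, following inputStage / inputStages.
function collectStages(plan, out = []) {
  if (!plan || typeof plan !== "object") return out;
  if (plan.stage) out.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, out);
  for (const child of plan.inputStages || []) collectStages(child, out);
  return out;
}

// True when any stage in the plan is a full collection scan.
function usesCollectionScan(plan) {
  return collectStages(plan).includes("COLLSCAN");
}

// Hand-written plan fragments for illustration only.
const indexedPlan = { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "item_1" } };
const unindexedPlan = { stage: "COLLSCAN" };

console.log(usesCollectionScan(indexedPlan));   // false
console.log(usesCollectionScan(unindexedPlan)); // true

// In the mongo shell the plan would come from something like:
//   db.items.find({ item: "abc" }).explain("executionStats").queryPlanner.winningPlan
// (`items` and the filter value are hypothetical.)
```

Queries that report a collection scan and are run often are the candidates for a new index.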
