关于ArangoDB中的多个索引用法 [英] On multiple index usage in ArangoDB

查看:414
本文介绍了关于ArangoDB中的多个索引用法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

具有以下结构的文档:

{
  path: String,
  enabled: Long,
  disabled: null || Long,
  // other fields...
}

我想通过路径前缀和一些数字与文档时间戳之间的数字关系的组合来查找文档:(伪代码)

I would like to look the documents up by a combination of path's prefix and numerical relation between some number and document's timestamps: (pseudocode)

SELECT e FROM entries 
WHERE 
  e.path STARTS WITH "somePrefix" 
  AND e.enabled <= timestamp 
  AND (
    e.disabled == null 
    OR timestamp < e.disabled
  )

如果有的话,我将从中受益最大的是什么索引结构?我应该在enabled字段上有一个非稀疏的跳过列表索引,还是在disabled上有一个稀疏的索引,而在path上是全文非稀疏的索引? ArangoDB是否能够对这些类型的查询使用多个索引?我阅读了有关索引使用情况的文档页面,但仍不清楚.

What index structure will I benefit from the most, if any? Should I have a non-sparse skiplist index on enabled field + a sparse one on disabled and a fulltext non-sparse one on the path? Is ArangoDB capable of utilising multiple indexes for these types of queries? I read the doc page on index usage, but I'm still unclear.

推荐答案

如果过滤条件与 logic或组合,并且索引满足分支条件.

ArangoDB can use multiple indexes on the same collection if the filter conditions are combined with logical or, and the indexes satisfy the or branches conditions.

对于您的查询,您将三个条件与逻辑和组合在一起,后者包含一个.

In the case of your query, you have three conditions combined with logical and, with the latter containing an or.

AQL中没有STARTS WITH谓词,但是您可以使用以前缀范围e.path >= @lower && e.path < @upper构造的范围查询.对于"somePrefix"的搜索值,边界将转换为@upper"somePrefix", and @lower being"somePrefiy"`(最后一个字符增加一个的搜索值).

There is no STARTS WITH predicate in AQL, but instead you could use a range query constructed with the prefix bounds: e.path >= @lower && e.path < @upper. For a search value of "somePrefix", the bounds would translate to @upper being "somePrefix", and@lowerbeing"somePrefiy"` (search value with last character increased by one).

path上创建一个跳过列表索引将使查询使用该索引.

Creating a skiplist index on path will make the query use that index.

包括enabled的搜索条件,到目前为止的组合条件是e.path >= @lower && e.path < @upper && e.enabled <= @timestamp.尽管可以在多个属性上创建一个跳过列表索引,但此处不会同时在pathenabled上使用它,而仅在path上使用.颠倒索引属性的顺序(即先颠倒enabled,然后颠倒path)也无济于事,因为那样的话,索引只能在enabled上使用,而不能在path上使用.

Including the search condition for enabled, the combined condition up to now is e.path >= @lower && e.path < @upper && e.enabled <= @timestamp. Though a skiplist index can be created on multiple attributes, it won't be used here on both path and enabled, but only on path. Reversing the order of index attributes (i.e. enabled first, then path) won't help either, because then the index would be used on enabled only, but not on path.

通常,跳过列表索引将用于条件条件中可以产生连续范围的部分.如果最左边的索引属性用于相等比较(例如,e.path == @path && e.enabled <= @timestamp将起作用),但是如果其最左边的索引属性是非相等比较(例如,e.path >= @lower && e.path <= @upper@e.enabled <= @timestamp),则是这种情况.它不会查看其进一步的索引属性,因为无论如何它都会产生不连续的范围.

In general, the skiplist index will be used for the parts of the condition that can produce a contiguous range. That is the case if the left-most index attributes are used in equality comparisons (e.g. e.path == @path && e.enabled <= @timestamp would work), but if its left-most index attributes are non-equality comparisons (e.g. e.path >= @lower && e.path <= @upper or @e.enabled <= @timestamp), then it won't look at its further index attributes, because it would produce a non-contiguous range anyway.

还可以选择在disabled上创建一个跳过列表索引.这使优化器可以在部件e.enabled <= @timestamp && (e.disabled == null || @timestamp < e.disabled)上使用该索引.可以将其转换为e.disabled == null || @timestamp < e.disabled,但是看起来不太有选择性.

There's also the option of creating a skiplist index on disabled. That allows the optimizer to use that index on the part e.enabled <= @timestamp && (e.disabled == null || @timestamp < e.disabled). It can transform this to e.disabled == null || @timestamp < e.disabled, which however does not look very selective.

总结:对于该特定查询,似乎没有很好的索引选择.如果可以通过某种方式将STARTS WITH更改为相等比较,则可以在pathenabled上创建组合的跳过列表索引,这可能是相当有选择性的. 如果STARTS WITH前缀的大小始终相同,则可能值得将该前缀保存在一个额外的属性中,该属性可以被索引而不是原始值,并可以使用等式比较来查询:e.pathPrefix == @prefix && e.enabled <= @timestamp. 这需要为每个文档保存和维护一个额外的前缀属性,但是当启用使用更具选择性的索引时,也许值得这样做.

In summary: there does not seem to be a good index choice for that particular query. If you could somehow change the STARTS WITH to an equality comparison, then you could create a combined skiplist index on path and enabled, and it would potentially be rather selective. If your STARTS WITH prefixes are always the same size, it might be worth to save the prefix in an extra attribute, that can be indexed instead of the original value and be queried using an equality comparison: e.pathPrefix == @prefix && e.enabled <= @timestamp. That requires saving and maintaining an extra prefix attribute per document, but maybe its worth it when enabling the use of a much more selective index.

这篇关于关于ArangoDB中的多个索引用法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆