关于 ArangoDB 中多索引的使用 [英] On multiple index usage in ArangoDB

查看:27
本文介绍了关于 ArangoDB 中多索引的使用的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

具有以下结构的文档:

{
  path: String,
  enabled: Long,
  disabled: null || Long,
  // other fields...
}

我想通过路径前缀和一些数字和文档时间戳之间的数字关系的组合来查找文档:(伪代码)

I would like to look the documents up by a combination of path's prefix and numerical relation between some number and document's timestamps: (pseudocode)

SELECT e FROM entries 
WHERE 
  e.path STARTS WITH "somePrefix" 
  AND e.enabled <= timestamp 
  AND (
    e.disabled == null 
    OR timestamp < e.disabled
  )

我将从哪种索引结构中受益最多(如果有的话)?我应该在 enabled 字段上有一个非稀疏跳过列表索引 + 在 disabled 上有一个稀疏索引,在 path 上有一个全文非稀疏索引吗?ArangoDB 是否能够为这些类型的查询使用多个索引?我阅读了关于索引使用的文档页面,但我仍然不清楚.

What index structure will I benefit from the most, if any? Should I have a non-sparse skiplist index on enabled field + a sparse one on disabled and a fulltext non-sparse one on the path? Is ArangoDB capable of utilising multiple indexes for these types of queries? I read the doc page on index usage, but I'm still unclear.

推荐答案

如果过滤条件与logical or结合,并且索引满足,ArangoDB可以在同一个集合上使用多个索引> 或 分支条件.

ArangoDB can use multiple indexes on the same collection if the filter conditions are combined with logical or, and the indexes satisfy the or branches conditions.

就您的查询而言,您有三个条件与逻辑和组合在一起,后者包含一个.

In the case of your query, you have three conditions combined with logical and, with the latter containing an or.

AQL 中没有 STARTS WITH 谓词,但您可以使用由前缀边界构造的范围查询:e.path >= @lower &&e.path <@upper.对于 "somePrefix" 的搜索值,边界将转换为 @upper"somePrefix",并且@lowerbeing"somePrefiy"`(最后一个字符加一的搜索值).

There is no STARTS WITH predicate in AQL, but instead you could use a range query constructed with the prefix bounds: e.path >= @lower && e.path < @upper. For a search value of "somePrefix", the bounds would translate to @upper being "somePrefix", and@lowerbeing"somePrefiy"` (search value with last character increased by one).

path 上创建跳过列表索引将使查询使用该索引.

Creating a skiplist index on path will make the query use that index.

包括enabled的搜索条件,到目前为止的组合条件是e.path >= @lower &&e.path <@upper &&e.enabled <= @timestamp.虽然可以在多个属性上创建跳过列表索引,但它不会在 pathenabled 上使用,而只能在 path 上使用.反转索引属性的顺序(即先enabled,然后是path)也无济于事,因为这样索引将用于enabled仅,但不在 path 上.

Including the search condition for enabled, the combined condition up to now is e.path >= @lower && e.path < @upper && e.enabled <= @timestamp. Though a skiplist index can be created on multiple attributes, it won't be used here on both path and enabled, but only on path. Reversing the order of index attributes (i.e. enabled first, then path) won't help either, because then the index would be used on enabled only, but not on path.

通常,skiplist 索引将用于条件中可以产生连续范围的部分.如果在等式比较中使用最左边的索引属性就是这种情况(例如 e.path == @path && e.enabled <= @timestamp 可以工作),但是如果它最左边的索引属性是非等式比较(例如 e.path >= @lower && e.path <= @upper@e.enabled <;= @timestamp),则它不会查看其进一步的索引属性,因为无论如何它都会产生不连续的范围.

In general, the skiplist index will be used for the parts of the condition that can produce a contiguous range. That is the case if the left-most index attributes are used in equality comparisons (e.g. e.path == @path && e.enabled <= @timestamp would work), but if its left-most index attributes are non-equality comparisons (e.g. e.path >= @lower && e.path <= @upper or @e.enabled <= @timestamp), then it won't look at its further index attributes, because it would produce a non-contiguous range anyway.

还可以选择在 disabled 上创建跳过列表索引.这允许优化器在 e.enabled e.enabled <= @timestamp && 部分使用该索引.(e.disabled == null || @timestamp .它可以将其转换为 e.disabled == null ||@timestamp <e.disabled,但看起来不是很有选择性.

There's also the option of creating a skiplist index on disabled. That allows the optimizer to use that index on the part e.enabled <= @timestamp && (e.disabled == null || @timestamp < e.disabled). It can transform this to e.disabled == null || @timestamp < e.disabled, which however does not look very selective.

总而言之:对于该特定查询似乎没有好的索引选择.如果您可以以某种方式将 STARTS WITH 更改为相等比较,那么您可以在 pathenabled 上创建组合跳过列表索引,它会可能是相当有选择性的.如果您的 STARTS WITH 前缀总是相同的大小,那么将前缀保存在一个额外的属性中可能是值得的,该属性可以被索引而不是原始值并使用相等比较进行查询:e.pathPrefix == @prefix &&e.enabled <= @timestamp.这需要为每个文档保存和维护一个额外的前缀属性,但在启用更具选择性的索引时可能是值得的.

In summary: there does not seem to be a good index choice for that particular query. If you could somehow change the STARTS WITH to an equality comparison, then you could create a combined skiplist index on path and enabled, and it would potentially be rather selective. If your STARTS WITH prefixes are always the same size, it might be worth to save the prefix in an extra attribute, that can be indexed instead of the original value and be queried using an equality comparison: e.pathPrefix == @prefix && e.enabled <= @timestamp. That requires saving and maintaining an extra prefix attribute per document, but maybe its worth it when enabling the use of a much more selective index.

这篇关于关于 ArangoDB 中多索引的使用的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆