How does ReversedWildcardFilterFactory speed up wildcard searches?
Problem description
The Solr docs say:
solr.ReversedWildcardFilterFactory
A filter that reverses tokens to provide faster leading wildcard and prefix queries. Add this filter to the index analyzer, but not the query analyzer. The standard Solr query parser will use this to reverse wildcard and prefix queries to improve performance...
How does it do that though?
Since all the tokens run through the ReversedWildcardFilterFactory, does it store all the tokens in reverse? (That seems silly to me)
Or, does it store all the tokens normally and the reversed tokens and then run through an index list roughly twice as long when querying? (Presumably that's still much faster than searching using a leading *)
Part of why I'm confused is that in the example schema.xml from Solr, they do the following:
<copyField source="*_en" dest="text_en_index"/>
<copyField source="*_en" dest="text_rev_index"/>
where text_rev_index uses a ReversedWildcardFilterFactory. If the ReversedWildcardFilterFactory stores both the forward and reversed tokens, I'm not sure why they would copy these fields to both the forward and reversed dest fields.
From https://docs.lucidworks.com/display/lweug/Wildcard+Queries:
The Lucid query parser will detect when leading wildcards are used and invoke the reversal filter, if present in the index analyzer, to reverse the wildcard term so that it will generate the proper query term that will match the reversed terms that are stored in the index for this field.
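To make the quoted description concrete, here is a toy Python model of the idea, not Solr's or Lucene's actual code. It assumes the "store both" variant described in the question: each token goes into a sorted term list as-is and also reversed. The `MARKER` prefix and the helper names `index_terms`, `prefix_scan`, and `leading_wildcard` are hypothetical, chosen only for illustration. The point is that a leading-wildcard query such as `*ing` turns into a prefix lookup on the reversed terms, which a sorted term list can answer with a binary search plus a short scan instead of a walk over every term.

```python
from bisect import bisect_left

# Hypothetical marker that keeps reversed terms apart from normal ones.
MARKER = "\u0001"


def index_terms(tokens, with_original=True):
    """Build a sorted term list holding reversed (and optionally original) tokens."""
    terms = set()
    for tok in tokens:
        if with_original:
            terms.add(tok)
        terms.add(MARKER + tok[::-1])
    return sorted(terms)


def prefix_scan(terms, prefix):
    """Return every term that starts with prefix, using binary search to find the start."""
    start = bisect_left(terms, prefix)
    hits = []
    for term in terms[start:]:
        if not term.startswith(prefix):
            break  # terms sharing a prefix are contiguous in sorted order
        hits.append(term)
    return hits


def leading_wildcard(terms, pattern):
    """Answer a query like '*ing' by rewriting it as a prefix query on reversed terms."""
    if not pattern.startswith("*") or "*" in pattern[1:]:
        raise ValueError("this sketch only handles a single leading '*'")
    reversed_prefix = MARKER + pattern[1:][::-1]
    hits = prefix_scan(terms, reversed_prefix)
    # Strip the marker and flip each hit back to its original spelling.
    return sorted(hit[len(MARKER):][::-1] for hit in hits)


terms = index_terms(["searching", "search", "indexing", "solr"])
print(leading_wildcard(terms, "*ing"))  # ['indexing', 'searching']
```

In this model the term list is twice as long as a forward-only one, but the lookup cost depends mainly on the number of matching terms, not on the total number of terms in the field.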