How does ReversedWildcardFilterFactory speed up wildcard searches?


Question

The Solr docs say:

solr.ReversedWildcardFilterFactory

A filter that reverses tokens to provide faster leading wildcard and prefix queries. Add this filter to the index analyzer, but not the query analyzer. The standard Solr query parser will use this to reverse wildcard and prefix queries to improve performance...

How does it do that though?

Since all the tokens run through the ReversedWildcardFilterFactory, does it store all the tokens in reverse? (That seems silly to me)

Or does it store both the normal tokens and the reversed tokens, and then search a term list roughly twice as long at query time? (Presumably that's still much faster than searching with a leading *.)

Part of why I'm confused is that in the example schema.xml from Solr, they do the following:

<copyField source="*_en" dest="text_en_index"/>
<copyField source="*_en" dest="text_rev_index"/>

where text_rev_index uses a ReversedWildcardFilterFactory. If the ReversedWildcardFilterFactory stores both the forward and reversed tokens, I'm not sure why they would copy these fields to both the forward and reversed dest fields.

Solution

From https://docs.lucidworks.com/display/lweug/Wildcard+Queries:

The Lucid query parser will detect when leading wildcards are used and invoke the reversal filter, if present in the index analyzer, to reverse the wildcard term so that it will generate the proper query term that will match the reversed terms that are stored in the index for this field.
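
To make the quoted mechanism concrete, here is a minimal Python sketch of the idea, not Solr's or Lucid's actual code. It assumes a sorted in-memory term list standing in for the index, a marker character prefixed to reversed terms so they cannot collide with ordinary ones, and a hypothetical `with_original` switch modeled on the filter's `withOriginal` option, which controls whether the forward tokens are indexed alongside the reversed ones. All function names are made up for illustration.

```python
from bisect import bisect_left

# Assumption: a marker character that keeps reversed terms apart from normal ones.
MARKER = "\u0001"


def index_terms(tokens, with_original=True):
    """Build a sorted term list holding each token reversed (and marked),
    plus the original token when with_original is true."""
    terms = set()
    for tok in tokens:
        terms.add(MARKER + tok[::-1])
        if with_original:
            terms.add(tok)
    return sorted(terms)


def prefix_scan(terms, prefix):
    """Return every term that starts with prefix.
    Binary search finds the first candidate; the scan stops at the first miss."""
    out = []
    for term in terms[bisect_left(terms, prefix):]:
        if not term.startswith(prefix):
            break
        out.append(term)
    return out


def leading_wildcard_search(terms, pattern):
    """Answer a '*suffix' query by rewriting it as a prefix query on reversed terms."""
    if not pattern.startswith("*") or "*" in pattern[1:]:
        raise ValueError("this sketch only handles a single leading '*'")
    reversed_prefix = MARKER + pattern[1:][::-1]
    # Strip the marker and un-reverse so the caller sees the original tokens.
    return [t[len(MARKER):][::-1] for t in prefix_scan(terms, reversed_prefix)]


terms = index_terms(["searching", "indexing", "index", "solr"])
matches = leading_wildcard_search(terms, "*ing")
```

The point of the rewrite is that `*ing` would otherwise require checking every term, whereas the reversed form `gni*` is an ordinary prefix lookup that only touches the matching slice of the sorted list. With `with_original=True` the term list does grow to roughly twice the size, but the lookup cost depends on the number of matches rather than on the total number of terms.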
