Lucene/Solr - Query Analysis working, but Select handler not

Question

I have an issue with my Solr settings. The select handler does not find "canaDa" the way it finds "canada".

Here is the schema for the fieldType text_en_splitting (all of these settings matter):

<fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>

  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" preserveOriginal="1" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

Here are the solrconfig.xml settings for the select handler:

<requestHandler name="/select" class="solr.SearchHandler">
     <lst name="defaults">
       <str name="echoParams">explicit</str>
       <int name="rows">20</int>
       <str name="df">text</str>

       <str name="defType">edismax</str>
       <str name="qf">court_id^0.1 jurisdiction^1.0 jur_code^0.5 court_name^1.5 court_code^0.5 court_type^1.0</str>
       <str name="mm">80%</str>
       <str name="q.alt">*:*</str>
       <str name="fl">*</str>
     </lst>
</requestHandler>

Here is the Query Analysis tool of the Solr admin UI (screenshot not available).

As you can see, the Query Analysis did break up "canaDa", but the search can't find it...

Recommended answer

The behavior you are seeing is correct given how the text_en_splitting fieldType is configured. With this configuration, "canaDa" will only match if the indexed term is also "canaDa", because then both are split into "cana" and "da". If you want "canaDa" to match "canada", I suggest removing the splitOnCaseChange=1 option from the WordDelimiterFilterFactory, since that is what causes the issue here.
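As a rough sketch of that suggestion, the two WordDelimiterFilterFactory lines from the question could be changed as shown below. This assumes the change is made in both the index and the query analyzer of text_en_splitting, and that all other attributes stay exactly as posted; it is an illustration, not a tested configuration.

```xml
<!-- Index analyzer: same filter as in the question, with splitOnCaseChange turned off (assumed change) -->
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0" />

<!-- Query analyzer: same filter as in the question, with splitOnCaseChange turned off (assumed change) -->
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0" preserveOriginal="1" />
```

With case-change splitting disabled, "canaDa" should stay a single token and be lowercased to "canada" by the LowerCaseFilterFactory that follows. Because the index analyzer changes as well, documents that are already indexed would need to be reindexed before the new analysis applies to them.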

If removing the splitOnCaseChange setting is not an option, please explain your requirements and expected behavior in more detail in the question so we can help you find a workable solution.
