Wrong spell-check suggestions by Solr


Problem description

We are working on spelling suggestions with Solr 4.1.

We believe it is configured correctly, and Solr offers both term suggestions and collations. However, we noticed that the suggested word or collation often returns no results when we search for it.

For example, we searched for the term "confort" and got no results, along with two suggestions: "comfort" and "convert". Searching for "comfort" returns results. Searching for "convert", however, returns nothing and instead produces two more suggestions, "connect" and "content". Of these, "connect" returns a few results, but "content" returns none and in turn suggests "connect" and "continent". "continent" also returns no results and suggests "connect".

The same happens for many search terms, and even for collations. We have no idea what is causing this. Can we turn off suggestions that return no results?

My Solr config:

<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="df">Name</str>
      <str name="spellcheck.dictionary">default</str>
      <str name="spellcheck.dictionary">wordbreak</str>
      <str name="spellcheck">on</str>
      <str name="spellcheck.extendedResults">true</str>       
      <str name="spellcheck.count">10</str>
      <str name="spellcheck.alternativeTermCount">5</str>
      <str name="spellcheck.maxResultsForSuggest">5</str>       
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.collateExtendedResults">true</str>  
      <str name="spellcheck.maxCollationTries">10</str>
      <str name="spellcheck.maxCollations">5</str>         
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
</requestHandler>

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">Name</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">4</int>
    <float name="maxQueryFrequency">0.01</float>
  </lst>

  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">Name</str>
    <str name="combineWords">true</str>
    <str name="breakWords">false</str>
    <int name="maxChanges">10</int>
  </lst>
</searchComponent>

My schema:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>   
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="Name" type="text" indexed="true" stored="true"  required="false" />

My query: http://localhost:8983/solr/mycore/spell?q=confort&spellcheck=true&Collate=true&spellcheck.extendedResults=true

Result:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">16</int>
  </lst>
  <result name="response" numFound="0" start="0"/>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="confort">
        <int name="numFound">2</int>
        <int name="startOffset">0</int>
        <int name="endOffset">7</int>
        <int name="origFreq">0</int>
        <arr name="suggestion">
          <lst>
            <str name="word">comfort</str>
            <int name="freq">6</int>
          </lst>
          <lst>
            <str name="word">convert</str>
            <int name="freq">2</int>
          </lst>
        </arr>
      </lst>
      <bool name="correctlySpelled">false</bool>
    </lst>
  </lst>
</response>

Solution

Is the field you search on the same field the spell checker is built from? Do both go through the same analysis?

One possible reason is that the fields are different, so suggestions drawn from the spellcheck field do not exist in the field actually being searched.

Another is that the fields are analyzed differently, so the suggested term and the search do not match.
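
To rule out the analysis mismatch while debugging, one option is to give the searched field and the spell checker exactly the same analysis chain. The fragment below is only a sketch under assumptions: the field type name `text_spell` is hypothetical and not part of the original configuration, and the stop-word and synonym filters are dropped purely to keep index-time and query-time analysis identical for the test.

```xml
<!-- schema.xml: "text_spell" is a hypothetical field type introduced for this sketch. -->
<!-- A single <analyzer> with no type attribute applies at both index and query time. -->
<fieldType name="text_spell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- The searched field now uses the shared analysis chain. -->
<field name="Name" type="text_spell" indexed="true" stored="true" required="false"/>

<!-- solrconfig.xml: the spell checker reads from the searched field (Name) -->
<!-- and analyzes the query with that field's type. -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text_spell</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">Name</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
  </lst>
</searchComponent>
```

Changing a field's type requires reindexing before the comparison is meaningful. If suggested terms start returning results with this setup, the difference between the index and query analyzers is a likely cause.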
