Is Solr SuggestComponent able to return shingles instead of whole field values?


Problem description

I use Solr 5.0.0 and want to build an autocomplete feature that generates suggestions from the word n-grams (shingles) of my documents. The problem is that a suggest query only returns complete "terms" of the search field, i.e. whole field values, which can be extremely long.

Current problem:

Input: "so"
Suggestions:
"......extremely long text son long text continuing......"
"......next long text solar next text continuing......"

Goal:

Input: "so"
Shingle suggestions:
"son"
"solar"
"solar test"

solrconfig.xml:

<searchComponent name="suggest" class="solr.SuggestComponent"
                 enable="${solr.suggester.enabled:true}">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title_and_description_suggest</str>
    <str name="weightField">price</str>
    <str name="suggestAnalyzerFieldType">autocomplete</str>
    <str name="queryAnalyzerFieldType">autocomplete</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

schema.xml:

<fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball"/>
      <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true" outputUnigramsIfNoShingles="true"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
</fieldType>

I want to return at most 3 words as the autocomplete term. Is this possible with the SuggestComponent, or how else would you do it? No matter what I try, I always receive the complete field value of the matching documents.

Is that expected behaviour, or did I do something wrong?

Thanks in advance.

Recommended answer

Define the field type in schema.xml as follows:

 <fieldType name="text_autocomplete" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="5"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.KeywordTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
    </fieldType>
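
To make the effect of the index-time analyzer concrete, here is a rough Python sketch of what the chain above produces for one field value. It is only an approximation of Solr's analysis (whitespace split, lowercase, then shingles of 2 to 5 words, with single words kept because outputUnigrams is left at its default of true). The function names `shingles` and `with_prefix` and the sample text are made up for illustration.

```python
def shingles(text, min_size=2, max_size=5, output_unigrams=True):
    """Approximate the index analyzer: whitespace split, lowercase, shingle."""
    tokens = text.lower().split()
    sizes = list(range(min_size, max_size + 1))
    if output_unigrams:
        # Assumption: unigrams are emitted alongside shingles (Solr default)
        sizes = [1] + sizes
    result = []
    for start in range(len(tokens)):
        for size in sizes:
            if start + size <= len(tokens):
                # Shingles are joined with a single space
                result.append(" ".join(tokens[start:start + size]))
    return result


def with_prefix(terms, prefix):
    """Keep distinct terms that start with the prefix, like facet.prefix does."""
    return sorted({t for t in terms if t.startswith(prefix)})


terms = shingles("Next long text Solar test continuing")
print(with_prefix(terms, "so"))
# prints ['solar', 'solar test', 'solar test continuing']
```

Because the terms are lowercased at index time, the prefix has to be lowercase as well; prefix matching is done against the indexed terms as they are.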

Define your field in schema.xml as follows:

<field name="example_field" type="text_autocomplete" indexed="true" stored="true"/>

Write your query as follows:

query?q=*&
rows=0&
facet=true&
facet.field=example_field&
facet.limit=-1&
wt=json&
indent=true&
facet.prefix=so

In the facet.prefix parameter, specify the term for which you want suggestions ('so' in this example). If you need fewer than 5 words per suggestion, reduce maxShingleSize in the fieldType definition accordingly. Results come back in decreasing order of frequency when facet.sort=count is in effect, which is the default only for a positive facet.limit; with facet.limit=-1 as in the query above, Solr defaults to index (lexicographic) order, so add facet.sort=count to get frequency ordering.
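
As a client-side illustration, the following Python sketch builds such a facet query and turns the JSON response into (suggestion, count) pairs. It assumes the default flat JSON layout for field facets, where terms and counts alternate in a single list. The base URL, the core name `mycore`, and the helper names are hypothetical, and the sketch uses `q=*:*` (the usual match-all syntax), an explicit `facet.sort=count`, and a positive `facet.limit` rather than the exact parameters shown above.

```python
from urllib.parse import urlencode


def build_suggest_url(base_url, field, prefix, limit=10):
    """Build a facet request URL that returns indexed terms starting with prefix."""
    params = {
        "q": "*:*",             # match all documents
        "rows": 0,              # only facet counts are needed, not documents
        "facet": "true",
        "facet.field": field,
        "facet.prefix": prefix.lower(),  # indexed terms are lowercased
        "facet.sort": "count",  # most frequent shingles first
        "facet.limit": limit,
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


def parse_suggestions(response, field):
    """Convert the flat [term, count, term, count, ...] list into pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


# Hypothetical endpoint; adjust host, core, and handler to your setup.
url = build_suggest_url("http://localhost:8983/solr/mycore/query", "example_field", "so")
print(url)

# Hand-written sample response in the shape described above, not real output.
sample = {"facet_counts": {"facet_fields": {"example_field": ["solar", 3, "solar test", 1]}}}
print(parse_suggestions(sample, "example_field"))
```

Fetching the URL and decoding the body with a JSON parser would yield a dictionary that can be passed to `parse_suggestions` in the same way as the sample.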
