How to have Solr autocomplete on whole phrase when query contains multiple terms?

Problem description

I've looked through a ton of examples and other questions here and from them, I've got my config very close to what I need but I'm missing one last little bit that I'm having a heck of a time working out. I'm searching on values like:

solar powered
solar glass
solar globe
solar lights
solar magic
solid brass
solid copper

What I want:

  1. If I search for sol the result should include all these values. This works.
  2. If I search for solar I should get just the first five. This works.
  3. If I search for solar gl I should get only solar glass and solar globe. This does not work. Instead, I get one set of matches for solar and a second set of matches for gl.

In a nutshell, I want to consider the input string as a whole, regardless of any whitespace. I gather this is accomplished by creating a separate query (versus index) analyzer, but I've not been able to make it work. Can anyone suggest a configuration that will get me what I'm looking for?
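
To make the goal concrete, here is a toy Python model of the two behaviours. It is not Solr code and does not reproduce Solr's FST lookup; the function names `suggest_whole` and `suggest_per_term` are invented for this illustration. The first treats the input as a single token, the second splits it on whitespace and looks up each term on its own.

```python
# Sample values from the question.
PHRASES = [
    "solar powered", "solar glass", "solar globe", "solar lights",
    "solar magic", "solid brass", "solid copper",
]

def suggest_whole(prefix, phrases=PHRASES):
    """Treat the input as one token and prefix-match it against whole phrases."""
    needle = prefix.lower()
    return [p for p in phrases if p.lower().startswith(needle)]

def suggest_per_term(query, phrases=PHRASES):
    """Split on whitespace and run one lookup per term (the unwanted behaviour)."""
    return {term: suggest_whole(term, phrases) for term in query.split()}

print(suggest_whole("sol"))          # every phrase in the list
print(suggest_whole("solar"))        # only the "solar ..." phrases
print(suggest_whole("solar gl"))     # ['solar glass', 'solar globe']
print(suggest_per_term("solar gl"))  # one result set for "solar", another for "gl"
```

The per-term version returns a separate result set for each word, which mirrors the symptom described in point 3.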

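I have (unsuccessfully) tried:

  • Querying with "solar gl"
  • Using mm=100%
  • Querying with [...]
  • Defining separate query and index analyzers using KeywordTokenizerFactory. (I don't know what I thought this would do.)
  • Defining an index analyzer but not a query analyzer.
  • Defining a query analyzer with no tokenizer.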

Here is my current schema:

<field name="suggest_phrase" type="suggest_phrase"
    indexed="true" stored="false" multiValued="false" />

And the field type definition:

<fieldType name="suggest_phrase" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
    </analyzer>
</fieldType>

And the config:

<searchComponent name="suggest_phrase" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
        <str name="name">suggest_phrase</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup</str>
        <str name="field">suggest_phrase</str>
        <str name="buildOnCommit">true</str>
    </lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest_phrase">
    <lst name="defaults">
        <str name="spellcheck">true</str>
        <str name="spellcheck.dictionary">suggest_phrase</str>
        <str name="spellcheck.onlyMorePopular">true</str>
        <str name="spellcheck.count">10</str>
        <str name="spellcheck.collate">false</str>
    </lst>
    <arr name="components">
        <str>suggest_phrase</str>
    </arr>
</requestHandler>

Recommended answer

Found the answer, finally! I knew I was really close. Turns out my configuration above was correct and I simply needed to change my query.

  1. Use KeywordTokenizerFactory so that the strings get indexed as a whole.
  2. Use SpellCheckComponent for the request handler.
  3. The piece I was missing -- don't query with q=<string> but with spellcheck.q=<string>.

Given the source strings noted above and a query of spellcheck.q=solar+gl this yields the desired results:

solar glass
solar globe
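
As a minimal sketch of what such a request could look like from client code, the Python snippet below builds the URL with the input passed as `spellcheck.q` rather than `q`. The base URL (`http://localhost:8983/solr`) and the helper name `suggest_url` are assumptions for illustration; the handler path `/suggest_phrase` comes from the config in the question, and `wt=json` is added only to request a JSON response.

```python
from urllib.parse import urlencode

# Assumed base URL: adjust host, port and core path for your installation.
BASE_URL = "http://localhost:8983/solr"

def suggest_url(text, handler="/suggest_phrase", base_url=BASE_URL):
    """Build a suggest request that sends the input via spellcheck.q, not q."""
    # urlencode encodes the space as '+', giving spellcheck.q=solar+gl.
    params = urlencode({"spellcheck.q": text, "wt": "json"})
    return f"{base_url}{handler}?{params}"

print(suggest_url("solar gl"))
# http://localhost:8983/solr/suggest_phrase?spellcheck.q=solar+gl&wt=json
```

Letting `urlencode` handle the escaping avoids hand-building the `solar+gl` form and keeps other special characters in the user's input safe.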
