Custom sorting in Elasticsearch


Problem description

I have some documents in Elasticsearch that use the completion suggester. When I search for a value such as Stack, the results are shown in the order given below:

  1. Stack Overflow
  2. Stack-Overflow
  3. Stack
  4. StackOver
  5. StackOverflow

I want the results to be displayed in this order:

  1. Stack
  2. StackOver
  3. StackOverflow
  4. Stack Overflow
  5. Stack-Overflow

That is, exact matches should come first, ahead of results that contain a space or special characters. TIA

Solution

It all depends on how the string you are querying is analysed. I suggest applying more than one analyser to the same string field. Below is an example mapping for the "name" field on which you want the autocomplete/suggester feature:

"name": {
    "type": "string",
    "analyzer": "keyword_analyzer",
    "fields": {
        "name_ac": {
            "type": "string",
            "index_analyzer": "string_autocomplete_analyzer",
            "search_analyzer": "keyword_analyzer"
        }
    }
}

Here, keyword_analyzer and string_autocomplete_analyzer are analysers defined in your index settings. Below is an example:

"keyword_analyzer": {
    "type": "custom",
    "filter": [
        "lowercase"
    ],
    "tokenizer": "keyword"
},
"string_autocomplete_analyzer": {
    "type": "custom",
    "filter": [
        "lowercase",
        "autocomplete"
    ],
    "tokenizer": "whitespace"
}

Here, autocomplete is a token filter defined in the same analysis settings:

"autocomplete": {
    "type": "edgeNGram",
    "min_gram": "1",
    "max_gram": "10"
}
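
To make the effect of string_autocomplete_analyzer more concrete, here is a rough plain-Java imitation of its pipeline (whitespace tokenizer, lowercase, edge n-grams of 1 to 10 characters). It does not call Elasticsearch; the class and method names are invented for illustration, and the real analyzer may differ in details such as token positions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class AutocompleteAnalyzerSketch {

    // Hypothetical helper: emits the prefixes of a token whose length
    // lies between minGram and maxGram, like an edge n-gram filter.
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    // Imitates: whitespace tokenizer -> lowercase -> edge n-gram (1..10).
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // blank input yields no tokens
            }
            tokens.addAll(edgeNGrams(word.toLowerCase(Locale.ROOT), 1, 10));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Stack Overflow"));
    }
}
```

Because the search side uses keyword_analyzer, a query for Stack becomes the single token stack, which is among the indexed prefixes of every name in the question. On the main name field, which keeps the whole value as one token, it only matches the document whose name is exactly Stack.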

With this in place, when you query Elasticsearch for the auto suggestions you can use a multi_match query instead of a plain match query, and give each field its own boost. Below is an example in Java:

QueryBuilders.multiMatchQuery(yourSearchString,"name^3","name_ac");

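You may need to adjust the boost (^3) to suit your needs. The boost on name is what lets an exact match on the whole string outrank documents that match only through the n-grams in name_ac.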
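
For readers who send the query over REST rather than through the Java client, the sketch below builds the equivalent multi_match request body as a string. It is only an illustration: the class and helper names are made up, the escaping is deliberately minimal, and in real code a JSON library or the client's own builders would be a safer choice.

```java
import java.util.List;
import java.util.stream.Collectors;

public class MultiMatchBodySketch {

    // Minimal JSON string quoting: handles backslashes and double quotes only.
    // Control characters are NOT escaped, so this is for illustration only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds a multi_match body; a field may carry a boost suffix such as "name^3".
    static String multiMatchBody(String text, List<String> fields) {
        String fieldList = fields.stream()
                .map(MultiMatchBodySketch::quote)
                .collect(Collectors.joining(","));
        return "{\"query\":{\"multi_match\":{\"query\":" + quote(text)
                + ",\"fields\":[" + fieldList + "]}}}";
    }

    public static void main(String[] args) {
        // Field names are taken from the mapping shown earlier in this answer.
        System.out.println(multiMatchBody("Stack", List.of("name^3", "name_ac")));
    }
}
```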

If this still does not meet your requirements, you can add one more analyser that analyses the string based on its first word, and include that field in the multi_match as well. Below is an example of such an analyser:

"first_word_name_analyzer": {
    "type": "custom",
    "filter": [
        "lowercase",
        "whitespace_merge",
        "edgengram"
    ],
    "tokenizer": "keyword"
}

With these analysis filters:

"whitespace_merge": {
    "pattern": "\\s+",
    "type": "pattern_replace",
    "replacement": " "
},
"edgengram": {
    "type": "edgeNGram",
    "min_gram": "1",
    "max_gram": "32"
}

You may have to experiment with the boost values to get the best results for your requirements. Hope this helps.
