Elasticsearch: customize score for synonyms/stemming


Problem description

I am using Elasticsearch 1.1.2.

I am using a multi_match query with different weights on the searchable fields.

Example:

{
  "multi_match": {
    "query": "this is a test",
    "fields": [
      "title^3",
      "description^2",
      "body"
    ]
  }
}

In this example, title is weighted three times as heavily as body.

I would like to customize the weight given for each field depending on the match found.

Say I search for "injury". I want to:

- Give the title a coefficient of 3 if an exact match is found: the title contains the word "injury".

- Give the title a coefficient of 2 if a synonym is found: the title contains the word "bruise".

- Give the title a coefficient of 1 if a stemmed form is found: the title contains the word "injuries".

Is there a way to do this kind of customization?

Thanks!

Solution

You can achieve this with a multi-field mapping on title.

A multi-field indexes the same input value several ways, each sub-field with its own analyzer.

Assuming you have already defined custom analyzers for synonyms and for stemming, update your mapping:

PUT /<index_name>/<type_name>/_mapping
{
  "<type_name>": {
    "properties": {
      "title": {
        "type": "string",
        "fields": {
          "exact": {
            "type": "string",
            "index": "not_analyzed"
          },
          "synonym": {
            "type": "string",
            "index": "analyzed",
            "analyzer": "synonym_analyzer"
          },
          "stemmed": {
            "type": "string",
            "index": "analyzed",
            "analyzer": "stemming_analyzer"
          }
        }
      }
    }
  }
}
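
The mapping above refers to analyzers named synonym_analyzer and stemming_analyzer without showing them. As a rough sketch only, the Python below builds an index-settings body that could define them. The filter name title_synonyms, the helper build_analysis_settings, and the single synonym rule are assumptions made for illustration, not part of the original answer; the built-in porter_stem filter is just one possible choice of stemmer. Analysis settings like these are normally supplied when the index is created.

```python
import json


def build_analysis_settings(synonym_rules):
    """Return an index-settings body defining the two analyzers used in the mapping."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Hypothetical filter name; "synonym" is the built-in filter type.
                    "title_synonyms": {
                        "type": "synonym",
                        "synonyms": list(synonym_rules),
                    }
                },
                "analyzer": {
                    "synonym_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "title_synonyms"],
                    },
                    "stemming_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # porter_stem is a built-in English stemming filter.
                        "filter": ["lowercase", "porter_stem"],
                    },
                },
            }
        }
    }


if __name__ == "__main__":
    # Body to send when creating the index, e.g. PUT /<index_name>
    print(json.dumps(build_analysis_settings(["injury, bruise"]), indent=2))
```

The printed JSON is intended as the body of the index-creation request; the synonym list would be replaced with your own rules.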

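The following query should then weight each kind of match as you want. Note that because title.exact is not_analyzed, it matches only when the whole title equals the query text; to match a single unstemmed word inside a longer title, an analyzed sub-field without synonym or stemming filters would be needed instead: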

POST /<index_name>/<type_name>/_search
{
  "query": {
    "multi_match": {
      "query": "injury",
      "fields": [
        "title.exact^3",
        "title.synonym^2",
        "title.stemmed"
      ]
    }
  }
}
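
If the coefficients need tuning, it can help to generate the request body instead of editing the boost suffixes by hand. The helper below is a minimal sketch under that assumption; build_boosted_query and its boosts argument are hypothetical names, and it only assembles the same multi_match body shown above.

```python
import json


def build_boosted_query(text, boosts):
    """Build a multi_match body; boosts maps a field name to its weight."""
    fields = []
    for field, weight in boosts.items():
        # A weight of 1 is the default boost, so the "^1" suffix is left off.
        fields.append(field if weight == 1 else f"{field}^{weight}")
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


if __name__ == "__main__":
    body = build_boosted_query(
        "injury",
        {"title.exact": 3, "title.synonym": 2, "title.stemmed": 1},
    )
    print(json.dumps(body, indent=2))
```

Changing a weight in the dictionary is then enough to try a different ranking.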
