ElasticSearch: Highlights every word in phrase query


Question


How can I get Elasticsearch to highlight only the words that caused the document to be returned?

I have the following index:

{
  "mappings": {
    "document": {
      "properties": {
        "content": {
          "type": "string",
          "fields": {
            "english": {
              "type": "string",
              "analyzer": "english"
            }
          }
        }
      }
    }
  }
}

Let's say I have indexed:

Nuclear power is the use of nuclear reactions that release nuclear energy[5] to generate heat, which most frequently is then used in steam turbines to produce electricity in a nuclear power station. The term includes nuclear fission, nuclear decay and nuclear fusion. Presently, the nuclear fission of elements in the actinide series of the periodic table produce the vast majority of nuclear energy in the direct service of humankind, with nuclear decay processes, primarily in the form of geothermal energy, and radioisotope thermoelectric generators, in niche uses making up the rest.

And I search for "nuclear elements"~2

I only want "nuclear fission of elements", or parts of it, to be highlighted, but every single occurrence of "nuclear" is highlighted instead.

This is my query if it helps:

{
  "fields": [
  ],
  "query": {
    "query_string": {
      "query": "\"nuclear elements\"~2",
      "fields": [
        "content.english"
      ]
    }
  },
  "highlight": {
    "pre_tags": [
      "<em class='h'>"
    ],
    "post_tags": [
      "</em>"
    ],
    "fragment_size": 500,
    "number_of_fragments": 20,
    "fields": {
      "content.english": {}
    }
  }
} 
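
For anyone reproducing this, here is a minimal Python sketch that builds the same request body from a phrase, a slop value, and a field name. The helper name `build_highlight_query` is hypothetical (it is not part of any Elasticsearch client), and the sketch only constructs and prints the JSON; sending it to a cluster is left out.

```python
import json


def build_highlight_query(phrase, slop, field):
    """Build the search body from the question for a sloppy phrase query.

    Hypothetical helper for illustration; it only assembles a dict.
    """
    return {
        "fields": [],
        "query": {
            "query_string": {
                # query_string syntax for a phrase with slop: "some phrase"~N
                "query": '"{}"~{}'.format(phrase, slop),
                "fields": [field],
            }
        },
        "highlight": {
            "pre_tags": ["<em class='h'>"],
            "post_tags": ["</em>"],
            "fragment_size": 500,
            "number_of_fragments": 20,
            "fields": {field: {}},
        },
    }


body = build_highlight_query("nuclear elements", 2, "content.english")
print(json.dumps(body, indent=2))
```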

Solution

There is a highlighting bug in ES 2.1, introduced by a change to the highlighter (#13239, according to the developer quoted below). It has since been fixed by a pull request.

According to an ES developer:

This is a bug that I introduced in #13239 while thinking that the differences were due to changes in Lucene: extractUnknownQuery is also called when span extraction already succeeded, so we should only fall back to Weight.extractTerms if no spans have been extracted yet.
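
To make the quoted explanation concrete, here is a toy Python model of the two behaviours. It is not the Lucene or Elasticsearch code: `find_spans` and `highlighted_positions` are hypothetical names, and the span matcher is a simplification that only handles two in-order terms and counts the tokens between them as slop. The `buggy` flag mimics running the term-level fallback even when spans were found; the default path falls back only when no span matched.

```python
def find_spans(tokens, first, second, slop):
    """Return (start, end) index pairs where `first` is followed by `second`
    with at most `slop` other tokens in between (in-order matches only)."""
    spans = []
    for i, tok in enumerate(tokens):
        if tok != first:
            continue
        for j in range(i + 1, min(i + slop + 2, len(tokens))):
            if tokens[j] == second:
                spans.append((i, j))
                break
    return spans


def highlighted_positions(tokens, first, second, slop, buggy=False):
    """Return the sorted token positions that would be highlighted."""
    spans = find_spans(tokens, first, second, slop)
    positions = {p for span in spans for p in span}
    if buggy or not spans:
        # Term-level fallback: every occurrence of either term.
        positions |= {i for i, t in enumerate(tokens) if t in (first, second)}
    return sorted(positions)


tokens = ("nuclear power uses nuclear reactions the nuclear fission "
          "of elements in the series").split()

print(highlighted_positions(tokens, "nuclear", "elements", 2))
print(highlighted_positions(tokens, "nuclear", "elements", 2, buggy=True))
```

In this model the default path marks only the two terms inside the matching span, while the `buggy` path also marks the two earlier occurrences of "nuclear", which is the symptom described in the question.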

Highlighting works as expected in older versions, up to and including 2.0, and should work as expected again in future versions that include the fix.
