Language analyzers and synonyms on the same field in Elasticsearch

Question

I have a German analyzer on the fields title and description, and it works fine for me:

"mappings": {
  "item": {
    "properties": {
      "title": {
        "type": "string",
        "analyzer": "german"
      },
      "description": {
        "type": "string",
        "analyzer": "german"
      }
    }
  }
}

But now I am trying to add synonyms. How can I add two analyzers to the same field?

Recommended answer

You can't attach two analyzers to one field. What you can do is define a custom analyzer that combines a synonym filter, the German-specific filters, and the tokenizer you need. In other words, you assemble everything you need into a single custom analyzer.

Something like this, for example (a very rough sketch):

PUT /my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "german_stop": {
          "type": "stop",
          "stopwords": "_german_"
        },
        "german_stemmer": {
          "type": "stemmer",
          "language": "light_german"
        },
        "my_synonyms": {
          "type": "synonym",
          "synonyms": [
            "british,english",
            "queen,monarch"
          ]
        }
      },
      "analyzer": {
        "german": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "german_stop",
            "my_synonyms",
            "german_normalization",
            "german_stemmer"
          ]
        }
      }
    }
  }
}

In the filter chain, you need to list every filter you want to apply: stemmers, synonyms, stop words, lowercase, and so on (keep in mind that the order matters). Then use the analyzer in your mappings, as you described in your question.
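To illustrate why the order of filters matters, here is a toy Python simulation of a filter chain. It is not how Elasticsearch implements analysis; the stop list, the synonym table, and all function names are made up for this sketch. Each filter only sees the tokens produced by the filters before it, so a synonym rule written in lowercase never fires if lowercasing happens later in the chain.

```python
# Toy stand-ins for token filters. These only mimic the idea of a chain;
# they are NOT Elasticsearch's implementations.
STOPWORDS = {"die", "der", "das"}  # tiny illustrative stop list (assumption)
SYNONYMS = {                       # mirrors the "queen,monarch" rule above
    "queen": ["queen", "monarch"],
    "monarch": ["queen", "monarch"],
}


def lowercase(tokens):
    return [t.lower() for t in tokens]


def stop(tokens):
    # Drops tokens that appear in the (lowercase) stop list.
    return [t for t in tokens if t not in STOPWORDS]


def synonyms(tokens):
    # Expands a token into its synonym group if a rule matches exactly.
    out = []
    for t in tokens:
        out.extend(SYNONYMS.get(t, [t]))
    return out


def run_chain(text, filters):
    tokens = text.split()  # crude whitespace tokenizer
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


# Lowercase first: the stop word is removed and the synonym rule matches.
print(run_chain("Die Queen", [lowercase, stop, synonyms]))  # ['queen', 'monarch']

# Lowercase last: "Die" and "Queen" match neither the stop list nor the rule.
print(run_chain("Die Queen", [synonyms, stop, lowercase]))  # ['die', 'queen']
```

The same reasoning applies to the real chain: a synonym filter placed after the stemmer sees stemmed tokens, so its rules have to match those stemmed forms.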

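Afterwards you can check the result by running _analyze against the index, so that the custom analyzer defined in its settings is used rather than the built-in german one:

GET /my_index/_analyze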
{
  "analyzer": "german",
  "text": "Darf ich mit Bargeld bezahlen?"
}
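If you prefer to run this check from a script, the following is a minimal Python sketch using only the standard library. The base URL http://localhost:9200 and the helper names are assumptions for illustration. It posts to the index-level _analyze endpoint (the API accepts POST as well as GET) so that the analyzer from that index's settings is resolved, and it reads the token strings from the "tokens" array of the response.

```python
import json
import urllib.request


def build_analyze_request(base_url, index, analyzer, text):
    """Build (but do not send) a POST request for the index-level _analyze API."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_tokens(response):
    """Return the token strings from a parsed _analyze response."""
    return [entry["token"] for entry in response.get("tokens", [])]


def analyze_text(base_url, index, analyzer, text):
    """Send the request and return the resulting tokens (needs a running cluster)."""
    request = build_analyze_request(base_url, index, analyzer, text)
    with urllib.request.urlopen(request) as reply:
        return extract_tokens(json.load(reply))


# Example call, assuming a local cluster and the index created above:
# analyze_text("http://localhost:9200", "my_index", "german",
#              "Darf ich mit Bargeld bezahlen?")
```

Comparing the returned tokens for a text with and without one of your synonyms is a quick way to confirm that the synonym filter sits at the right place in the chain.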
