Elasticsearch custom file based analyzer

Question

I am using Elasticsearch 6 and have a full-text field in the index. This field stores the category of a product and can take only one of a few possible values (e.g. fruit, leafy vegetables). I want to analyze the field in a custom way, with the tokens specified in a file.
For example:

fresh fruit -> [fruit, fresh fruit]

Is there a way to use a custom analyzer whose final tokens come from a mapping file, as above?

Recommended answer

What you are looking for is the synonym token filter. Create a custom analyzer that uses this filter, so that an input of either fresh fruit or fruit produces the single token fruit. Define the analysis in the index settings, then apply the analyzer to the category field as below:

PUT my_index
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_synonym_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "custom_synonym"
            ]
          }
        },
        "filter": {
          "custom_synonym": {
            "type": "synonym",
            "synonyms": [
              "fresh fruit, fruit => fruit"
            ]
            // To read the rules from a file instead, replace "synonyms" above with: "synonyms_path": "analysis/synonyms.txt"
          }
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "category": {
          "type": "text",
          "analyzer": "my_synonym_analyzer"
        }
      }
    }
  }
}
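
Since the question asks for rules kept in a file, the filter can point at that file instead of listing the rules inline. The fragment below is a sketch of what the "filter" object under "analysis" might look like in that case. It assumes a file named analysis/synonyms.txt (the path from the comment in the request above) that exists under the Elasticsearch config directory on every node and contains the single line: fresh fruit, fruit => fruit

```json
{
  "custom_synonym": {
    "type": "synonym",
    "synonyms_path": "analysis/synonyms.txt"
  }
}
```

The file uses the same rule syntax as the inline array, one rule per line. As far as I know, the file is read when the analyzer is built, so edits to it are unlikely to take effect until the index is closed and reopened or the node is restarted.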

Now, when you search the category field for either fruit or fresh fruit, documents containing either value will match. This is because, by default, Elasticsearch analyzes the search string with the same analyzer that was applied to the field at index time. Both fruit and fresh fruit therefore reduce to the token fruit, so the documents match.
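
One way to check this behaviour is the _analyze API, which returns the tokens an analyzer produces for a given text. As a sketch, assuming the index my_index was created with the settings above, the following body can be sent to GET my_index/_analyze:

```json
{
  "analyzer": "my_synonym_analyzer",
  "text": "fresh fruit"
}
```

If the synonym rule is applied as described, the response should list a single token, fruit. Repeating the request with the text fruit should give the same token, which is why both search strings match the same documents.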