How can I get auto-suggestions for synonym matches in Elasticsearch?

Question

I'm using the code below, and it does not give "curd" as an auto-suggestion when I type "cu".

It does, however, match the document containing "yogurt" when I search for a full synonym, which is correct. How can I get both auto-complete for synonym words and document matching for the same field?

PUT products
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "synonym_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "synonym_graph"
            ]
          }
        },
        "filter": {
          "synonym_graph": {
            "type": "synonym_graph",
            "synonyms": [
               "yogurt, curd, dahi"
            ]
          }
        }
      }
    }
  }
}

PUT products/_mapping
{
  "properties": {
    "description": {
      "type": "text",
      "analyzer": "synonym_analyzer"
    }
  }
}

POST products/_doc
{
  "description": "yogurt"
}

GET products/_search
{
  "query": {
    "match": {
      "description": "cu"
    }
  }
}

Recommended answer

When you provide a list of synonyms in a synonym_graph filter, it simply means that ES will treat any of the synonyms interchangeably. But because synonym_analyzer is built on the standard tokenizer, only full-word tokens will be produced:

POST products/_analyze?filter_path=tokens.token
{
  "text": "yogurt",
  "field": "description"
}

yields:

{
  "tokens" : [
    {
      "token" : "curd"
    },
    {
      "token" : "dahi"
    },
    {
      "token" : "yogurt"
    }
  ]
}

As such, a regular match query won't cut it here, because the analysis chain hasn't provided it with enough context in terms of matchable substrings (n-grams). The query term "cu" is simply not equal to any of the indexed tokens.
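To make the notion of matchable substrings concrete, here is a minimal sketch that runs the word curd through an ad-hoc analysis chain with an edge_ngram filter. The min_gram and max_gram values are picked purely for illustration and are not part of the original setup:

```
# Illustrative only: the edge_ngram settings below are assumptions, not part of the products index
POST _analyze?filter_path=tokens.token
{
  "tokenizer": "standard",
  "filter": [
    "lowercase",
    { "type": "edge_ngram", "min_gram": 2, "max_gram": 4 }
  ],
  "text": "curd"
}
```

With these settings the response should contain the tokens cu, cur and curd. A prefix such as cu only becomes directly matchable once grams like these are actually present in the index, which is not the case for the description field above.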

Instead, you can replace match with match_phrase_prefix, which does exactly what you're after. It treats the last term of the input as a prefix while taking the synonyms into account:

GET products/_search
{
  "query": {
    "match_phrase_prefix": {
      "description": "cu"
    }
  }
}

But that, as the query name suggests, is only going to work for prefixes. If you'd like an autocomplete that suggests terms regardless of where the substring match occurs, have a look at my other answer, where I talk about leveraging n-grams.
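As a rough idea of what an n-gram based setup could look like (a sketch under assumptions, not the configuration from the other answer), the index below expands synonyms first and then splits every token into n-grams at index time, while the search side only lowercases the input. The index name products_ngram, the analyzer and filter names, and the gram sizes are all hypothetical:

```
# Hypothetical index: all names and gram sizes are assumptions chosen for illustration
PUT products_ngram
{
  "settings": {
    "index": {
      "max_ngram_diff": 8,
      "analysis": {
        "analyzer": {
          "synonym_ngram_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "product_synonyms",
              "substring_ngrams"
            ]
          },
          "lowercase_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase"
            ]
          }
        },
        "filter": {
          "product_synonyms": {
            "type": "synonym",
            "synonyms": [
              "yogurt, curd, dahi"
            ]
          },
          "substring_ngrams": {
            "type": "ngram",
            "min_gram": 2,
            "max_gram": 10
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "analyzer": "synonym_ngram_analyzer",
        "search_analyzer": "lowercase_analyzer"
      }
    }
  }
}
```

With a mapping along these lines, a plain match query for a fragment such as "ah" should hit a document whose description is yogurt, since ah is one of the grams indexed for the synonym dahi. The trade-off is a larger index, and the gram sizes would need tuning for real data.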
