How can I get auto-suggestions for synonym matches in Elasticsearch?
Question
I'm using the code below, and it does not give "curd" as an auto-suggestion when I type "cu".
It does, however, match the document containing "yogurt", which is correct. How can I get both autocomplete for synonyms and document matching for the same terms?
PUT products
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "synonym_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "synonym_graph"
            ]
          }
        },
        "filter": {
          "synonym_graph": {
            "type": "synonym_graph",
            "synonyms": [
              "yogurt, curd, dahi"
            ]
          }
        }
      }
    }
  }
}

PUT products/_mapping
{
  "properties": {
    "description": {
      "type": "text",
      "analyzer": "synonym_analyzer"
    }
  }
}

POST products/_doc
{
  "description": "yogurt"
}

GET products/_search
{
  "query": {
    "match": {
      "description": "cu"
    }
  }
}
Recommended answer
When you provide a list of synonyms in a synonym_graph filter, it simply means that ES will treat any of the synonyms interchangeably. But because the analyzer uses the standard tokenizer, only full-word tokens are produced:
POST products/_analyze?filter_path=tokens.token
{
  "text": "yogurt",
  "field": "description"
}
yields:
{
  "tokens" : [
    {
      "token" : "curd"
    },
    {
      "token" : "dahi"
    },
    {
      "token" : "yogurt"
    }
  ]
}
As such, a regular match query won't cut it here: it compares whole tokens, and the analyzer hasn't produced any matchable substrings (n-grams) for it to work with.
In the meantime, you can replace match with match_phrase_prefix, which does exactly what you're after: it matches an ordered sequence of characters while taking the synonyms into account:
GET products/_search
{
  "query": {
    "match_phrase_prefix": {
      "description": "cu"
    }
  }
}
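
To make the difference between the two queries concrete, here is a rough Python sketch. It is only a simplified model, not how Elasticsearch is implemented: the token list is the one returned by the _analyze call above, and the helper names term_match and prefix_match are hypothetical.

```python
# Tokens that the _analyze request reports for the indexed text "yogurt".
indexed_tokens = ["curd", "dahi", "yogurt"]

def term_match(query, tokens):
    """Simplified model of a single-term `match`: whole-token equality only."""
    return query.lower() in tokens

def prefix_match(query, tokens):
    """Simplified model of a single-term `match_phrase_prefix`:
    return every indexed token that starts with the query text."""
    q = query.lower()
    return [t for t in tokens if t.startswith(q)]

print(term_match("cu", indexed_tokens))    # "cu" is not a whole token
print(prefix_match("cu", indexed_tokens))  # "cu" is a prefix of "curd"
```

Under this model, "cu" equals none of the indexed tokens, so the plain match finds nothing, while the prefix expansion reaches "curd", which was indexed only because of the synonym rule.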
But that, as the query name suggests, is only going to work for prefixes. If you want an autocomplete that suggests terms regardless of where in the word the substring match occurs, have a look at my other answer, where I talk about leveraging n-grams.
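
As a rough illustration of the n-gram idea (a sketch of the general technique, not the setup from the linked answer), the Python below splits each token into all substrings of a given length range and checks a query against them. The function names are hypothetical, and the min_gram and max_gram values are arbitrary choices for the example.

```python
def ngrams(token, min_gram=2, max_gram=3):
    """Return every substring of `token` with length min_gram..max_gram."""
    grams = []
    for size in range(min_gram, max_gram + 1):
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams

def substring_hit(query, tokens, min_gram=2, max_gram=3):
    """True if the query equals any n-gram of any token."""
    indexed = {g for t in tokens for g in ngrams(t, min_gram, max_gram)}
    return query.lower() in indexed

tokens = ["curd", "dahi", "yogurt"]
print(ngrams("curd"))
print(substring_hit("ur", tokens))  # "ur" occurs inside "curd" and "yogurt"
```

Because every substring in the chosen length range is indexed, a query such as "ur" can hit in the middle of a word, which a prefix query cannot do. The trade-off is a larger index, since each token expands into many grams.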