Partial matching from the middle with the Elasticsearch completion suggester


Question

I have a field named search_suggest with the following mapping:

search_suggest: {
   type: "completion",
   analyzer: "simple",
   payloads: true,
   preserve_separators: false,
   preserve_position_increments: false,
   max_input_length: 50
}

Its value is indexed as:

{
  input: [
   "apple iphone 6"
  ],
  output: "apple iphone 6",
  weight: 5,
  payload: {
   category: "mobiles"
  }
}

If I search for "apple", I get results. But if I search for "iphone", I get no results.

Is there any way to do this with the completion suggester? Do I have to index the input as

  • apple iphone 6
  • iphone 6
  • 6

I am aware of the edge-ngram approach, but its drawback is that it also returns duplicate suggestions.

Please help.

Recommended answer

If anyone is still looking for an answer:

The completion suggester is designed for prefix matching. So, in input, you can supply the possible suffixes of your phrase. Each suffix is itself matched by prefix, so a query still matches when it starts in the middle of the phrase (in effect, a substring search that begins at a word boundary).

For example:

{
  "text" : "Courtyard by Marriot Munich City",
  "text_suggest" : {
    "input": [
      "Courtyard by Marriot Munich City",
      "by Marriot Munich City",
      "Marriot Munich City",
      "Munich City",
      "City"
    ],
    "output" : "Courtyard by Marriot Munich City",
    "weight" : 11,
    "payload": { "id" : 314159 }
  }
}

As you can see, wherever you start within "Courtyard by Marriot Munich City", you will get results (except perhaps for "by", because in most cases it will be discarded as a stop word).

Going up to 4-5 steps is enough for general search strings. Also, if you handle stop words with a filter, there is no need to worry about stop words in the input.
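The suffix list does not have to be written by hand. Below is a minimal Python sketch of one way to generate it at indexing time. The helper names suffix_inputs and build_suggest_field and the max_steps parameter are made up for this illustration and are not part of Elasticsearch; the returned dictionary simply mirrors the field shape used in the example above.

```python
import json


def suffix_inputs(phrase, max_steps=5):
    """Return the phrase followed by its word-level suffixes.

    At most max_steps entries are produced, following the
    "4-5 steps" rule of thumb from the answer.
    """
    words = phrase.split()
    return [" ".join(words[i:]) for i in range(min(len(words), max_steps))]


def build_suggest_field(phrase, weight, payload=None):
    """Assemble a completion field value shaped like the example above.

    Hypothetical helper: it only builds a plain dict and does not
    talk to Elasticsearch.
    """
    field = {
        "input": suffix_inputs(phrase),
        "output": phrase,
        "weight": weight,
    }
    if payload is not None:
        field["payload"] = payload
    return field


if __name__ == "__main__":
    doc = build_suggest_field(
        "Courtyard by Marriot Munich City", 11, {"id": 314159}
    )
    print(json.dumps(doc, indent=2))
```

Splitting on whitespace keeps the sketch simple; if your phrases contain punctuation or stop words, you may prefer to filter the suffixes before indexing.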

Sample index analyzer:

{
  "settings" : {
    "analysis" : {
      "filter" : {
        "suggester_stop" : {
          "type" : "stop",
          "stopwords" : "_english_",
          "remove_trailing" : false,
          "ignore_case" : true
        },
        "suggester_stemmer" : {
          "type" : "stemmer",
          "name" : "light_english"
        }
      },
      "analyzer" : {
        "suggester_analyzer" : {
          "type" : "custom",
          "tokenizer" : "standard",
          "char_filter" : ["html_strip"],
          "filter" : [
            "standard",
            "lowercase",
            "suggester_stop",
            "suggester_stemmer"
          ]
        }
      }
    }
  }
}

This will solve the problem you mentioned in one of your comments:

"Then if I suggest for "apple ip", it won't give a result. How about iphone 6?"

{
  "text_suggest" : {
    "input": [
      "apple iphone 6",
      "iphone 6"
    ],
    "output" : "apple iphone 6",
    "weight" : 11
  }
}

You will get results for "apple ip", "iphone 6", and so on. However, you will not get a result for "apple 6", which is not a common thing for people to search for anyway.
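To see why those queries behave that way, here is a small Python simulation of prefix matching against the indexed inputs. It is only a rough model: normalize is a made-up stand-in for the analyzer (lowercasing and whitespace collapsing only), and it ignores mapping options such as preserve_separators, so real Elasticsearch behaviour can differ in the details.

```python
def normalize(text):
    """Lowercase and collapse whitespace; a crude stand-in for an analyzer."""
    return " ".join(text.lower().split())


def matches(query, inputs):
    """Return True if the normalized query is a prefix of any input."""
    q = normalize(query)
    return any(normalize(candidate).startswith(q) for candidate in inputs)


# The two inputs indexed in the example above.
EXAMPLE_INPUTS = ["apple iphone 6", "iphone 6"]

if __name__ == "__main__":
    for query in ["apple ip", "iphone 6", "ip", "apple 6"]:
        print(query, "->", matches(query, EXAMPLE_INPUTS))
```

"apple 6" fails because neither input starts with that string: prefix matching cannot skip the word "iphone" in the middle.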
