Elasticsearch completion suggest search with multiple-word inputs

Question

Using the Elasticsearch completion suggester, I have trouble getting multi-word input suggestions returned for a one-word query.

Example structure:

PUT /test_index/
{
   "mappings": {
      "item": {
         "properties": {
            "test_suggest": {
               "type": "completion",
               "index_analyzer": "whitespace",
               "search_analyzer": "whitespace",
               "payloads": false
            }
         }
      }
   }
}

PUT /test_index/item/1
{
   "test_suggest": {
      "input": [
         "cat dog",
         "elephant"
      ]
   }
}

Working query:

POST /test_index/_suggest
{
    "test_suggest":{
        "text":"cat",
        "completion": {
            "field" : "test_suggest"
        }
    }
}

Result:

{
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "test_suggest": [
      {
         "text": "cat",
         "offset": 0,
         "length": 3,
         "options": [
            {
               "text": "cat dog",
               "score": 1
            }
         ]
      }
   ]
}

Failing query:

POST /test_index/_suggest
{
    "test_suggest":{
        "text":"dog",
        "completion": {
            "field" : "test_suggest"
        }
    }
}

Result:

{
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "test_suggest": [
      {
         "text": "dog",
         "offset": 0,
         "length": 3,
         "options": []
      }
   ]
}

I would expect the same result as the working query, matching 'cat dog'. Any suggestions as to what the problem is and how to make the failing query work? I get the same results when using the standard analyzer instead of the whitespace analyzer. I would like to use multiple words per input string, as shown in the example above.

Recommended answer

The completion suggester is a prefix suggester, meaning it tries to match your query against the first few characters of the inputs it has been given. If you want the document you posted to match the text "dog", then you'll need to specify "dog" as an input:

PUT /test_index/item/1
{
   "test_suggest": {
      "input": [
         "cat dog",
         "elephant",
         "dog"
      ]
   }
}
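
To make the prefix behaviour concrete, here is a minimal Python sketch. It is not how Elasticsearch implements the suggester internally; it only assumes that an input matches when it starts with the query text. The helper names `prefix_suggest` and `word_start_inputs` are made up for illustration. The second helper shows one possible way to generate the extra inputs (one per word start) before indexing.

```python
def prefix_suggest(inputs, text):
    """Return the inputs that start with `text` (simplified prefix matching)."""
    return [candidate for candidate in inputs if candidate.startswith(text)]


def word_start_inputs(phrase):
    """Expand a phrase into one input per word start: 'cat dog' -> ['cat dog', 'dog']."""
    words = phrase.split()
    return [" ".join(words[i:]) for i in range(len(words))]


original_inputs = ["cat dog", "elephant"]
print(prefix_suggest(original_inputs, "cat"))  # ['cat dog']
print(prefix_suggest(original_inputs, "dog"))  # []

# Add every word-start suffix so that later words can be matched too.
expanded_inputs = []
for phrase in original_inputs:
    expanded_inputs.extend(word_start_inputs(phrase))

print(expanded_inputs)                         # ['cat dog', 'dog', 'elephant']
print(prefix_suggest(expanded_inputs, "dog"))  # ['dog']
```

In this sketch "dog" matches only because it was added as an input of its own, which mirrors the fix shown above.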

In my experience, the limitation of having to specify inputs to match makes completion suggesters less useful than other ways to implement prefix matching. I like edge ngrams for this purpose. I recently wrote a blog post about using ngrams that you might find helpful: http://blog.qbox.io/an-introduction-to-ngrams-in-elasticsearch

As a quick example, here is a mapping you could use:

PUT /test_index
{
   "settings": {
      "analysis": {
         "filter": {
            "edge_ngram_filter": {
               "type": "edge_ngram",
               "min_gram": 2,
               "max_gram": 20
            }
         },
         "analyzer": {
            "edge_ngram_analyzer": {
               "type": "custom",
               "tokenizer": "standard",
               "filter": [
                  "lowercase",
                  "edge_ngram_filter"
               ]
            }
         }
      }
   },
   "mappings": {
      "item": {
         "properties": {
            "text_field": {
               "type": "string",
               "index_analyzer": "edge_ngram_analyzer",
               "search_analyzer": "standard"
            }
         }
      }
   }
}
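
As a rough illustration of what this analyzer puts into the index, the following Python sketch approximates the chain of standard tokenizer, lowercase filter, and edge ngram filter with `min_gram` 2 and `max_gram` 20. The regular-expression word split is only a stand-in for the standard tokenizer, and the function names `edge_ngrams` and `index_tokens` are hypothetical.

```python
import re


def edge_ngrams(token, min_gram=2, max_gram=20):
    """Return the leading substrings of `token` with lengths min_gram..max_gram."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def index_tokens(text, min_gram=2, max_gram=20):
    """Approximate the analyzer: split into words, lowercase, emit edge n-grams."""
    words = re.findall(r"\w+", text.lower())
    tokens = []
    for word in words:
        tokens.extend(edge_ngrams(word, min_gram, max_gram))
    return tokens


print(index_tokens("cat dog"))   # ['ca', 'cat', 'do', 'dog']
print(index_tokens("elephant"))  # ['el', 'ele', 'elep', 'eleph', 'elepha', 'elephan', 'elephant']
```

Each word contributes its own prefixes, which is why a word in the middle of a value can be found without listing it separately.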

Then index the doc like this:

PUT /test_index/item/1
{
   "text_field": [
      "cat dog",
      "elephant"
   ]
}

And any of these queries will return it:

POST /test_index/_search
{
    "query": {
        "match": {
           "text_field": "dog"
        }
    }
}

POST /test_index/_search
{
    "query": {
        "match": {
           "text_field": "ele"
        }
    }
}

POST /test_index/_search
{
    "query": {
        "match": {
           "text_field": "ca"
        }
    }
}
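
Why these queries hit can be sketched in a few lines of Python. The assumption here is that the query text is only split into lowercase words at search time (as with the `standard` search analyzer in the mapping above), while the indexed values are expanded into edge n-grams, and that a match needs just one query word to equal an indexed term. The names `indexed_terms` and `matches` are illustrative, not part of any Elasticsearch API.

```python
def indexed_terms(values, min_gram=2, max_gram=20):
    """Collect the edge n-grams of every lowercase word in the field values."""
    terms = set()
    for value in values:
        for word in value.lower().split():
            for n in range(min_gram, min(len(word), max_gram) + 1):
                terms.add(word[:n])
    return terms


def matches(values, query):
    """Hit if any whole query word equals one of the indexed terms."""
    terms = indexed_terms(values)
    return any(word in terms for word in query.lower().split())


doc = ["cat dog", "elephant"]
for query in ["dog", "ele", "ca", "c", "og"]:
    print(query, matches(doc, query))
# dog True
# ele True
# ca True
# c False   (shorter than min_gram)
# og False  (not a prefix of any word)
```

The last two queries show the limits of this setup: text shorter than `min_gram` and text that is not a word prefix find nothing.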

Here is all the code in one place:

http://sense.qbox.io/gist/4a08fbb6e42c34ff8904badfaaeecc01139f96cf
