Elasticsearch: allow users to optionally use an exact match

Problem description

I'm using a multi_match query in Elasticsearch, since I'm only interested in 3 fields.

    query: {
      filtered: {
        query: {
          multi_match: {
            fields: ['subject', 'text', 'task_comments.text'],
            query: USER_INPUT
          }
        }
      }
    }

If I search for Apple TV, I get results containing "Apple TV", Apple, and TV.

I would like users to be able to opt into exact matching through their input. So, if they search for "Apple TV" (with double quotes), the search should only return results that contain "Apple TV". Results that contain only Apple shouldn't be returned.

Is it possible to do that with Elasticsearch alone?

Or do I need to change the query my application generates based on the user's input?

Recommended answer

You can set up your index to also have a "raw" un-analyzed sub-field for each field you want to search against.

As a toy example, I set up a simple index and added a few docs:

PUT /test_index
{
    "mappings": {
        "doc":{
            "properties": {
                "text_field": {
                    "type": "string",
                    "analyzer": "standard",
                    "fields": {
                        "raw": {
                            "type": "string", 
                            "index": "not_analyzed"
                        }
                    }
                }
            }
        }
    }
}

POST /test_index/doc/_bulk
{"index":{"_id":1}}
{"text_field": "Apple TV"}
{"index":{"_id":2}}
{"text_field": "Apple iPhone"}
{"index":{"_id":3}}
{"text_field": "Apple MacBook"}

This index uses the standard analyzer for the main field (specifying it is redundant since it's the default, but I wanted to make it explicit), and no analyzer at all for the sub-field, so the sub-field stores each value as a single, unmodified term.

So if I search against the main field, I get all three docs back:

POST /test_index/_search
{
    "query": {
        "match": {
           "text_field": "Apple TV"
        }
    }
}
...
{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": 0.98479235,
      "hits": [
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "1",
            "_score": 0.98479235,
            "_source": {
               "text_field": "Apple TV"
            }
         },
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "2",
            "_score": 0.10063131,
            "_source": {
               "text_field": "Apple iPhone"
            }
         },
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "3",
            "_score": 0.10063131,
            "_source": {
               "text_field": "Apple MacBook"
            }
         }
      ]
   }
}

But if I search against the "raw" sub-field, I only get back the one doc whose entire value is exactly "Apple TV":

POST /test_index/_search
{
    "query": {
        "match": {
           "text_field.raw": "Apple TV"
        }
    }
}
...
{
   "took": 3,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 1.4054651,
      "hits": [
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "1",
            "_score": 1.4054651,
            "_source": {
               "text_field": "Apple TV"
            }
         }
      ]
   }
}
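
To make the difference between the two searches concrete, here is a toy model in Python. It is only a rough sketch, not how Elasticsearch works internally: the `analyze` function is a simplified stand-in for the standard analyzer (lowercased word tokens), and the function names are hypothetical.

```python
import re

# The three documents from the toy index above, keyed by _id.
DOCS = {1: "Apple TV", 2: "Apple iPhone", 3: "Apple MacBook"}


def analyze(text):
    """Simplified stand-in for the standard analyzer: lowercased word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def search_analyzed(query):
    """Return ids of docs sharing at least one token with the query."""
    terms = analyze(query)
    return sorted(doc_id for doc_id, text in DOCS.items() if terms & analyze(text))


def search_raw(query):
    """Return ids of docs whose whole, unmodified value equals the query."""
    return sorted(doc_id for doc_id, text in DOCS.items() if text == query)


print(search_analyzed("Apple TV"))  # [1, 2, 3]
print(search_raw("Apple TV"))       # [1]
```

The analyzed search matches every document because each one contains the token "apple", while the raw search compares the whole string, including case.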

You should be able to do this for each of your fields to get it working with your multi_match query. Alternatively, you could set something up with the _all field and then just use a "match" query against it.
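
On the application side, switching between the two behaviors could look roughly like the Python sketch below. It assumes that each of the three fields from the question has been mapped with a "raw" sub-field as in the toy index; the function name `build_query` and the quote-detection rule are illustrative assumptions, and the `filtered` wrapper from the question is left out for brevity.

```python
def build_query(user_input, fields=("subject", "text", "task_comments.text")):
    """Build a multi_match body; quoted input targets the .raw sub-fields.

    Assumes every field has a not_analyzed "raw" sub-field (hypothetical
    for the question's index; it mirrors the toy mapping above).
    """
    text = user_input.strip()
    # Treat input wrapped in double quotes as a request for an exact match.
    exact = len(text) >= 2 and text.startswith('"') and text.endswith('"')
    if exact:
        text = text[1:-1]
        target_fields = [name + ".raw" for name in fields]
    else:
        target_fields = list(fields)
    return {"query": {"multi_match": {"fields": target_fields, "query": text}}}


print(build_query("Apple TV"))
print(build_query('"Apple TV"'))
```

Keep in mind that the raw sub-fields match only when the entire field value equals the input. If the goal is to find the phrase inside longer text, multi_match with `"type": "phrase"` on the analyzed fields is a different technique that may fit better.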

Here is the code all in one place:

http://sense.qbox.io/gist/31ff17997b4971b6515f019ab514f9a17da1a606
