Search for exact field in an array of strings in Elasticsearch


Question

Elasticsearch version: 7.1.1

Hi, I have tried a lot but could not find a solution. In my index I have a field that contains an array of strings.

So, for example, I have two documents whose locations arrays contain different values.

Document 1:

"doc" : {
    "locations" : [
        "Cloppenburg",
        "Berlin"
    ]
}

Document 2:

"doc" : {
    "locations" : [
        "Landkreis Cloppenburg",
        "Berlin"
    ]
}

A user searches for the term Cloppenburg, and I want to return only those documents that contain the term Cloppenburg and not Landkreis Cloppenburg. The results should contain only Document 1, but my query returns both documents.

I am using the following query and getting both documents back. Can someone please help me with this?

GET /my_index/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "doc.locations": {
                            "query": "cloppenburg",
                            "operator": "and"
                        }
                    }
                }
            ]
        }
    }
}


Recommended answer

The issue is that you are using a text field together with a match query.

Match queries are analyzed: the search terms go through the same analyzer that was used at index time, which for text fields is the standard analyzer by default. It splits text into words and lowercases them, so in your case Landkreis Cloppenburg becomes the two tokens landkreis and cloppenburg at index time. A search for cloppenburg is analyzed the same way and therefore matches that document too.
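To make the effect concrete, here is a rough Python sketch that imitates the analysis step for the two documents from the question. It is a simplified stand-in (split into word tokens, lowercase), not Elasticsearch's actual tokenizer, and the function names `standard_like_tokens` and `text_field_matches` are made up for this illustration.

```python
import re

def standard_like_tokens(text):
    # Simplified stand-in for the standard analyzer: word tokens, lowercased.
    return [token.lower() for token in re.findall(r"\w+", text)]

def text_field_matches(values, query):
    # Every array element is analyzed and its tokens end up in the same field.
    indexed = {tok for value in values for tok in standard_like_tokens(value)}
    # With "operator": "and", every token of the query must be present.
    return all(tok in indexed for tok in standard_like_tokens(query))

doc1_locations = ["Cloppenburg", "Berlin"]
doc2_locations = ["Landkreis Cloppenburg", "Berlin"]

print(standard_like_tokens("Landkreis Cloppenburg"))       # ['landkreis', 'cloppenburg']
print(text_field_matches(doc1_locations, "cloppenburg"))   # True
print(text_field_matches(doc2_locations, "cloppenburg"))   # True
```

Both documents contain the token cloppenburg after analysis, which is why the match query cannot tell them apart.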

Solution: use a keyword field

Index definition

{
    "mappings": {
        "properties": {
            "location": {
                "type": "keyword"
            }
        }
    }
}

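Index both documents and then run the same kind of search query. Because a keyword field is not analyzed, the query value must equal the stored string exactly, including case.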

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "location": {
                            "query": "Cloppenburg"
                        }
                    }
                }
            ]
        }
    }
}
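As an alternative to match, a term query is the usual way to look up an exact value in a keyword field, since it skips analysis entirely. The sketch below only builds the request body in Python and prints it as JSON; the helper name `exact_location_query` is hypothetical, and sending the body to a cluster is left out.

```python
import json

def exact_location_query(field, value):
    # A term query compares the value as-is against the indexed terms.
    return {"query": {"term": {field: {"value": value}}}}

body = exact_location_query("location", "Cloppenburg")
print(json.dumps(body, indent=4))
```

If the original index was created through dynamic mapping, a `doc.locations.keyword` sub-field probably already exists and could be passed as `field` without reindexing. That is an assumption about the asker's index, not something the question states.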

Result

 "hits": [
            {
                "_index": "location",
                "_type": "_doc",
                "_id": "2",
                "_score": 0.6931471,
                "_source": {
                    "location": "Cloppenburg"
                }
            }
        ]
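The reason only the exact value comes back can be sketched in a few lines of Python. This models a keyword field with no normalizer configured (an assumption; the mapping above does not define one), where each array element is stored as a single unmodified term. The function name `keyword_field_matches` is made up for this illustration.

```python
def keyword_field_matches(values, query):
    # Each array element is one term; a query matches only on exact equality.
    return query in values

doc1 = ["Cloppenburg", "Berlin"]
doc2 = ["Landkreis Cloppenburg", "Berlin"]

print(keyword_field_matches(doc1, "Cloppenburg"))  # True
print(keyword_field_matches(doc2, "Cloppenburg"))  # False
print(keyword_field_matches(doc1, "cloppenburg"))  # False: case differs
```

The last line shows the trade-off: the lowercase query from the question would no longer match unless the values are lowercased at index time as well.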
