Case-insensitive Elasticsearch search with uppercase or lowercase input


Problem description

I am working with Elasticsearch and have run into a problem. If anybody could give me a hint, I would be really thankful.

I want to analyze a field, "name" or "description", which consists of different entries. For example, someone wants to search for Sara. Whether they enter SARA, SAra or sara, they should get Sara back. Elasticsearch uses an analyzer which makes everything lowercase.

I want the search to be case-insensitive: regardless of whether the user types the name in uppercase or lowercase, he/she should get results. I am using an ngram filter to search names, together with a lowercase filter, which makes it case-insensitive. But I want to make sure that a person gets results whether they enter uppercase or lowercase.

Is there any way to do this in Elasticsearch?

{"settings": {

        "analysis": {
            "filter": {
                "ngram_filter": {
                    "type": "ngram",
                    "min_gram": 1,
                    "max_gram": 80
                }
            },
            "analyzer": {
                "index_ngram": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": [ "ngram_filter", "lowercase" ]
                },

I have attached an example.js file, which contains a JSON example, and a search.txt file to explain my problem. I hope this makes the problem clearer. This is the link to the OneDrive folder where I keep both files: https://1drv.ms/f/s!AsW4Pb3Y55Qjb34OtQI7qQotLzc
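As a rough illustration of what the index_ngram analyzer above does to a value, the following Python sketch imitates the chain of keyword tokenizer, ngram_filter (min_gram 1, max_gram 80) and lowercase filter. It is a simulation under stated assumptions, not Elasticsearch code: the function names ngrams and index_ngram are made up for this example, and the order in which Elasticsearch emits grams may differ from the order produced here.

```python
def ngrams(text, min_gram=1, max_gram=80):
    """Return every substring whose length is between min_gram and max_gram."""
    grams = []
    for size in range(min_gram, min(max_gram, len(text)) + 1):
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams


def index_ngram(text):
    """Imitate: keyword tokenizer -> ngram_filter -> lowercase (hypothetical helper)."""
    # keyword tokenizer: the whole field value is kept as a single token
    token = text
    # ngram filter first, lowercase second, in the order listed in the settings
    return [gram.lower() for gram in ngrams(token)]


if __name__ == "__main__":
    for value in ("Sara", "SARA", "sara"):
        print(value, sorted(set(index_ngram(value))))
```

Because the lowercase filter runs on every gram, all three spellings end up as the same set of terms. The same sketch also shows that a four-letter name already yields ten grams, including single letters, which is worth keeping in mind if this analyzer is applied to the query text as well.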

Solution

Is there a specific reason you are using ngram? Elasticsearch applies the same analyzer to the query text as to the text you index, unless a search_analyzer is explicitly specified, as @Adam mentions in his answer. In your case it might be enough to use the standard tokenizer with a lowercase filter.
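To make the point about analysis on both sides concrete, here is a minimal Python sketch that approximates a standard tokenizer followed by a lowercase filter and applies it to both the indexed value and the query. It is only an approximation: splitting on runs of word characters stands in for the Unicode-aware standard tokenizer, and the names analyze and matches are invented for this illustration.

```python
import re


def analyze(text):
    """Approximate 'standard tokenizer + lowercase filter' (illustrative only)."""
    tokens = re.findall(r"\w+", text)          # rough stand-in for the standard tokenizer
    return [token.lower() for token in tokens]  # lowercase filter


# Index time: the field value is analyzed and its terms are stored.
indexed_terms = set(analyze("Sara Connor"))


def matches(query):
    """Query time: the query goes through the same analyzer before lookup."""
    return any(term in indexed_terms for term in analyze(query))


if __name__ == "__main__":
    for query in ("SARA", "SAra", "sara", "John"):
        print(query, analyze(query), matches(query))
```

Since both the stored terms and the query terms are lowercased by the same chain, the spelling the user types does not affect whether a term matches.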

I created an index with the following settings and mapping:

{
   "settings": {
      "analysis": {
         "analyzer": {
            "custom_analyzer": {
               "type": "custom",
               "tokenizer": "standard",
               "filter": [
                  "lowercase"
               ]
            }
         }
      }
   },
   "mappings": {
      "test_mapping": {
         "properties": {
            "name": {
               "type": "string",
               "analyzer": "custom_analyzer"
            },
            "description": {
               "type": "string",
               "analyzer": "custom_analyzer"
            }
         }
      }
   }
}

I then indexed two documents.

Doc 1

PUT /test_index/test_mapping/1
    {
        "name" : "Sara Connor",
        "Description" : "My real name is Sarah Connor."
    }

Doc 2

PUT /test_index/test_mapping/2
    {
        "name" : "John Connor",
        "Description" : "I might save humanity someday."
    }

Then I ran a simple search:

POST /test_index/_search
{
    "query" : {
        "match" : {
            "name" : "SARA"
        }
    }
}

This returns only the first document. I also tried "sara" and "Sara" and got the same result.

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.19178301,
    "hits": [
      {
        "_index": "test_index",
        "_type": "test_mapping",
        "_id": "1",
        "_score": 0.19178301,
        "_source": {
          "name": "Sara Connor",
          "Description": "My real name is Sarah Connor."
        }
      }
    ]
  }
}
