Elasticsearch wildcard search on not_analyzed field

Question

I have an index with the following settings and mapping:

{
  "settings":{
     "index":{
        "analysis":{
           "analyzer":{
              "analyzer_keyword":{
                 "tokenizer":"keyword",
                 "filter":"lowercase"
              }
           }
        }
     }
  },
  "mappings":{
     "product":{
        "properties":{
           "name":{
              "analyzer":"analyzer_keyword",
              "type":"string",
              "index": "not_analyzed"
           }
        }
     }
  }
}

I am struggling to implement a wildcard search on the name field. My example data looks like this:

[
{"name": "SVF-123"},
{"name": "SVF-234"}
]

Query:

curl -XGET 'http://localhost:9200/my_index/product/_search' -d '
{
    "query": {
        "filtered" : {
            "query" : {
                "query_string" : {
                    "query": "*SVF-1*"
                }
            }
        }

    }
}'

It returns both SVF-123 and SVF-234. I think it still tokenizes the data. It should return only SVF-123.

Could you please help?

Thanks in advance.

Recommended answer

My solution journey

I started from the setup you can see in my question. Whenever I changed one part of my settings, one thing started to work but another stopped working. Here is the history of my solution:

1.) I indexed my data with the default settings, which means the name field was analyzed by default. That caused the problem. For example:

When a user searched for a keyword such as SVF-1, the system ran this query:

{
    "query": {
        "filtered" : {
            "query" : {
                "query_string" : {
                    "analyze_wildcard": true,
                    "query": "*SVF-1*"
                }
            }
        }

    }
}

Result:

SVF-123
SVF-234

This is expected, because the name field of my documents is analyzed. The query is split into the tokens SVF and 1; SVF matches both documents even though 1 matches neither. I abandoned this approach and created a mapping that makes my fields not_analyzed:

{
  "mappings":{
     "product":{
        "properties":{
           "name":{
              "type":"string",
              "index": "not_analyzed"
           },
           "site":{
              "type":"string",
              "index": "not_analyzed"
           } 
        }
     }
  }
}

But my problem persisted.

2.) After a lot of research I tried another way and decided to use a wildcard query. My query is:

{
    "query": {
        "filtered": {
            "query": {
                "wildcard": {
                    "name": {
                        "value": "*SVF-1*"
                    }
                }
            },
            "filter": {
                "term": {"site": "pro_en_GB"}
            }
        }
    }
}

This query worked, with one problem: case sensitivity. My fields are now not_analyzed, and a wildcard query is not analyzed either, so the search term has to match the stored value's case exactly. Users can type the keyword in lowercase, and a search for svf-1 returns nothing.
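To make the behaviour in steps 1 and 2 concrete, here is a rough simulation in plain Python. It does not call Elasticsearch: `simple_analyze` is only a hypothetical stand-in for an analyzed field (lowercase, split on non-alphanumeric characters), and `fnmatchcase` stands in for a case-sensitive wildcard match against the whole untokenized value.

```python
import re
from fnmatch import fnmatchcase

# Example data from the question.
NAMES = ["SVF-123", "SVF-234"]


def simple_analyze(text):
    """Hypothetical stand-in for an analyzed field: lowercase and split
    on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def analyzed_hits(keyword):
    """Step 1: a document matches if ANY query token equals ANY of its tokens."""
    query_tokens = set(simple_analyze(keyword))
    return [n for n in NAMES if query_tokens & set(simple_analyze(n))]


def not_analyzed_hits(keyword):
    """Step 2: case-sensitive wildcard match against the whole stored value."""
    pattern = "*" + keyword + "*"
    return [n for n in NAMES if fnmatchcase(n, pattern)]


if __name__ == "__main__":
    print(analyzed_hits("SVF-1"))      # both names: the token "svf" is shared
    print(not_analyzed_hits("SVF-1"))  # only the name containing "SVF-1"
    print(not_analyzed_hits("svf-1"))  # nothing: the match is case-sensitive
```

The last call is the case-sensitivity problem described above: the stored value is uppercase, so a lowercase pattern finds nothing.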

3.) I changed my mapping to:

{
  "mappings":{
     "product":{
        "properties":{
           "name":{
              "type":"string",
              "index": "not_analyzed"
           },
           "nameLowerCase":{
              "type":"string",
              "index": "not_analyzed"
           },
           "site":{
              "type":"string",
              "index": "not_analyzed"
           } 
        }
     }
  }
}

I added one more field for the name, called nameLowerCase. When indexing, I build each document like this:

{
    "name": "SVF-123",
    "nameLowerCase": "svf-123",
    "site": "pro_en_GB"
}

At search time I convert the query keyword to lowercase and search on the new nameLowerCase field, while still displaying the name field.

The final version of my query is:
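The lowercasing on both sides could be wrapped in two small helpers, sketched here in Python. The function names `to_indexed_doc` and `to_wildcard_pattern` are hypothetical and not part of any Elasticsearch client; the field names come from the mapping in step 3.

```python
def to_indexed_doc(name, site):
    """Build the document to index: keep the original name for display
    and store a lowercased copy for searching."""
    return {
        "name": name,
        "nameLowerCase": name.lower(),
        "site": site,
    }


def to_wildcard_pattern(keyword):
    """Lowercase the user's keyword and wrap it in '*' so it can be used
    as the wildcard value against nameLowerCase."""
    return "*" + keyword.lower() + "*"


if __name__ == "__main__":
    print(to_indexed_doc("SVF-123", "pro_en_GB"))
    print(to_wildcard_pattern("SVF-1"))
```

Because both the stored copy and the pattern go through the same `lower()` call, the user's choice of case no longer affects the match. Note that this sketch passes any `*` or `?` typed by the user straight through as wildcard characters.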

{
    "query": {
        "filtered": {
            "query": {
                "wildcard": {
                    "nameLowerCase": {
                        "value": "*svf-1*"
                    }
                }
            },
            "filter": {
                "term": {"site": "pro_en_GB"}
            }
        }
    }
}

Now it works. Another way to solve this problem is to use multi_field. My query contains a dash (-), and I ran into some problems with it.

Many thanks to @Alex Brasetvik for his detailed explanation and effort.
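For reference, a multi-field variant might look roughly like the Python sketch below, which only builds the index body as a dictionary. This is an assumption, not the mapping I actually used: it relies on the `fields` sub-field syntax of Elasticsearch 1.x string mappings, reuses the analyzer_keyword analyzer from the question, and the sub-field name `lower` and the function `build_index_body` are made up for illustration.

```python
def build_index_body():
    """Index settings plus a mapping in which name keeps its exact value
    and gains a lowercased, untokenized sub-field (hypothetical name: lower)."""
    return {
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        # Keyword tokenizer keeps the value as one token,
                        # the lowercase filter normalizes its case.
                        "analyzer_keyword": {
                            "tokenizer": "keyword",
                            "filter": ["lowercase"],
                        }
                    }
                }
            }
        },
        "mappings": {
            "product": {
                "properties": {
                    "name": {
                        "type": "string",
                        "index": "not_analyzed",
                        "fields": {
                            "lower": {
                                "type": "string",
                                "analyzer": "analyzer_keyword",
                            }
                        },
                    },
                    "site": {"type": "string", "index": "not_analyzed"},
                }
            }
        },
    }


if __name__ == "__main__":
    import json
    print(json.dumps(build_index_body(), indent=2))
```

With such a mapping the wildcard query would target name.lower instead of a separate nameLowerCase field. The keyword would still need to be lowercased before searching, because the wildcard value is not analyzed.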
