Elasticsearch wildcard search on not_analyzed field

Views: 112 | Published: 2017/8/7 1:02:31 | Tags: search, lucene, elasticsearch, tokenize


Question

I have an index with the following settings and mapping:

{
  "settings":{
     "index":{
        "analysis":{
           "analyzer":{
              "analyzer_keyword":{
                 "tokenizer":"keyword",
                 "filter":"lowercase"
              }
           }
        }
     }
  },
  "mappings":{
     "product":{
        "properties":{
           "name":{
              "analyzer":"analyzer_keyword",
              "type":"string",
              "index": "not_analyzed"
           }
        }
     }
  }
}

I am struggling to implement a wildcard search on the name field. My example data looks like this:

[
{"name": "SVF-123"},
{"name": "SVF-234"}
]

Query:

curl -XGET 'http://localhost:9200/my_index/product/_search' -d '
{
    "query": {
        "filtered" : {
            "query" : {
                "query_string" : {
                    "query": "*SVF-1*"
                }
            }
        }

    }
}'

It returns both SVF-123 and SVF-234. I think it is still tokenizing the data. It should return only SVF-123.

Can you help?

Thanks in advance.

Recommended answer

My solution adventure

My case started as you can see in my question. Whenever I changed one part of my settings, one thing started to work but another stopped working. Here is the history of my solution:

1.) I indexed my data with the defaults. This means my data was analyzed by default, which caused the problem on my side. For example:

When a user searched for a keyword like SVF-1, the system ran this query:

{
    "query": {
        "filtered" : {
            "query" : {
                "query_string" : {
                    "analyze_wildcard": true,
                    "query": "*SVF-1*"
                }
            }
        }

    }
}

Result:

SVF-123
SVF-234

This is expected, because the name field of my documents is analyzed. The query is split into the tokens SVF and 1, and SVF matches both of my documents even though 1 does not. I abandoned this approach and created a mapping that makes my fields not_analyzed:

{
  "mappings":{
     "product":{
        "properties":{
           "name":{
              "type":"string",
              "index": "not_analyzed"
           },
           "site":{
              "type":"string",
              "index": "not_analyzed"
           } 
        }
     }
  }
}

But my problem continued.
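The behavior described in step 1 can be illustrated with a rough simulation. This is only a sketch: `rough_analyze` and `search_any` are hypothetical helpers, not Elasticsearch APIs, and the splitting rule (break on non-alphanumeric characters, then lowercase) is an assumed approximation of a default analyzer rather than Lucene's actual tokenizer. It also assumes the query tokens are combined with OR and ignores the wildcard characters.

```python
import re

def rough_analyze(text):
    """Assumed stand-in for a default analyzer: split on non-alphanumerics, lowercase."""
    return [tok.lower() for tok in re.split(r"[^0-9A-Za-z]+", text) if tok]

docs = ["SVF-123", "SVF-234"]

# Index each document under the set of tokens produced by the analyzer.
index = {doc: set(rough_analyze(doc)) for doc in docs}

def search_any(query):
    """Return documents sharing at least one token with the analyzed query (OR semantics)."""
    query_tokens = set(rough_analyze(query))
    return [doc for doc in docs if index[doc] & query_tokens]

matches = search_any("SVF-1")
print(rough_analyze("SVF-1"))  # ['svf', '1']
print(matches)                 # ['SVF-123', 'SVF-234']
```

Both documents come back because they share the token `svf` with the query, which is consistent with the result seen in step 1.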

2.) After a lot of research, I wanted to try another way and decided to use a wildcard query. My query is:

{
    "query": {
        "wildcard" : {
            "name" : {
                "value" : "*SVF-1*"
            }
        }
    },
    "filter": {
        "term": {"site": "pro_en_GB"}
    }
}

This query worked, but there is one problem. My fields are no longer analyzed, and I am running a wildcard query, so case sensitivity becomes an issue. Users may type the query in lowercase, and a search for svf-1 returns nothing.
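The case-sensitivity problem can be reproduced outside Elasticsearch with a small sketch. It assumes that a wildcard query on a not_analyzed field behaves like a case-sensitive glob match against the whole stored value, and uses Python's `fnmatch.fnmatchcase` as a stand-in; `wildcard_match` is a hypothetical helper, not an Elasticsearch call.

```python
from fnmatch import fnmatchcase

# Values stored verbatim, as they would be in a not_analyzed field.
terms = ["SVF-123", "SVF-234"]

def wildcard_match(pattern):
    """Return the stored values that match the glob pattern, case-sensitively."""
    return [term for term in terms if fnmatchcase(term, pattern)]

upper_hits = wildcard_match("*SVF-1*")
lower_hits = wildcard_match("*svf-1*")

print(upper_hits)  # ['SVF-123']
print(lower_hits)  # []
```

The uppercase pattern finds only SVF-123, while the lowercase pattern finds nothing because the stored values keep their original case.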

3.) I changed my document structure to:

{
  "mappings":{
     "product":{
        "properties":{
           "name":{
              "type":"string",
              "index": "not_analyzed"
           },
           "nameLowerCase":{
              "type":"string",
              "index": "not_analyzed"
           },
           "site":{
              "type":"string",
              "index": "not_analyzed"
           }
        }
     }
  }
}

I added one more field for the name, called nameLowerCase. When I index a document, I set it up like this:

{ "name": "SVF-123", "nameLowerCase": "svf-123", "site": "pro_en_GB" }

Here, I convert the query keyword to lowercase and run the search against the new nameLowerCase field, while still displaying the name field.

The final version of my query is:

{
    "query": {
        "wildcard" : {
            "nameLowerCase" : {
                "value" : "*svf-1*"
            }
        }
    },
    "filter": {
        "term": {"site": "pro_en_GB"}
    }
}

Now it works. Another way to solve this problem is to use multi_field. My query contains a dash (-), and I faced some problems with it.
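The lowercase-field approach from step 3 can be sketched on the client side as follows. The helper names `build_document` and `build_wildcard_query` are hypothetical and only mirror the document and query shapes shown in this answer; sending them to Elasticsearch is left out, and whether the top-level `filter` key is accepted depends on the Elasticsearch version in use.

```python
import json

def build_document(name, site):
    """Build the document to index, adding a lowercased copy of the name."""
    return {"name": name, "nameLowerCase": name.lower(), "site": site}

def build_wildcard_query(keyword, site):
    """Lowercase the user's keyword and wrap it in wildcards for nameLowerCase."""
    return {
        "query": {
            "wildcard": {
                "nameLowerCase": {"value": "*" + keyword.lower() + "*"}
            }
        },
        "filter": {"term": {"site": site}},
    }

document = build_document("SVF-123", "pro_en_GB")
query = build_wildcard_query("SVF-1", "pro_en_GB")

print(json.dumps(document))
print(json.dumps(query, indent=2))
```

Because both the stored copy and the keyword are lowercased the same way, the user can type SVF-1 or svf-1 and get the same pattern, while the original `name` value stays available for display.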

Many thanks to @Alex Brasetvik for his detailed explanation and effort.
