Elasticsearch query string doesn't search by word part

Views: 95 Published: 2017/8/6 23:07:12 Tags: elasticsearch, query-string


Problem description

I'm sending this request:
curl -XGET 'host/process_test_3/14/_search' -d '{
  "query" : {
    "query_string" : {
      "query" : "\"*cor interface*\"",
      "fields" : ["title", "obj_id"]
    }
  }
}'
And I'm getting the correct result:
{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 5.421598,
    "hits": [
      {
        "_index": "process_test_3",
        "_type": "14",
        "_id": "141_dashboard_14",
        "_score": 5.421598,
        "_source": {
          "obj_type": "dashboard",
          "obj_id": "141",
          "title": "Cor Interface Monitoring"
        }
      }
    ]
  }
}
But when I want to search by part of a word, for example:
curl -XGET 'host/process_test_3/14/_search' -d '
{
  "query" : {
    "query_string" : {
      "query" : "\"*cor inter*\"",
      "fields" : ["title", "obj_id"]
    }
  }
}'
I'm getting no results back:
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : []
  }
}
What am I doing wrong?

Solution

This is because your title field has probably been analyzed by the standard analyzer (default setting) and the title Cor Interface Monitoring has been tokenized as the three tokens cor, interface and monitoring.
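
The effect can be reproduced outside Elasticsearch with a rough sketch. The function `standard_like_tokens` below is a hypothetical stand-in, not the real standard analyzer: it only splits on non-alphanumeric characters and lowercases, which is enough to approximate the behaviour for this title.

```python
import re


def standard_like_tokens(text):
    """Rough stand-in for the standard analyzer (an assumption, not the
    real implementation): split on non-alphanumerics, then lowercase."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


indexed = standard_like_tokens("Cor Interface Monitoring")
print(indexed)  # ['cor', 'interface', 'monitoring']

# Each term of the query must exist as a whole token in the index.
for term in standard_like_tokens("*cor inter*"):
    print(term, term in indexed)
```

Under these assumptions `cor` is found but `inter` is not, because only the whole word `interface` was indexed.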

In order to search any substring of words, you need to create a custom analyzer which leverages the ngram token filter in order to also index all substrings of each of your tokens.

You can create your index like this:
curl -XPUT localhost:9200/process_test_3 -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "substring_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "substring"]
        }
      },
      "filter": {
        "substring": {
          "type": "nGram",
          "min_gram": 2,
          "max_gram": 15
        }
      }
    }
  },
  "mappings": {
    "14": {
      "properties": {
        "title": {
          "type": "string",
          "analyzer": "substring_analyzer"
        }
      }
    }
  }
}'
Then you can reindex your data. With this analyzer, the title Cor Interface Monitoring will now be tokenized as:

co, cor, or
in, int, inte, inter, interf, etc.
mo, mon, moni, etc.

so that your second search query will now return the document you expect, because the tokens cor and inter will now match.
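
To see which tokens such a filter would produce, here is a minimal sketch of n-gram generation using the same `min_gram` and `max_gram` values as the settings above. The helper `ngrams` is a hypothetical illustration, not Elasticsearch code, and it assumes the title has already been split and lowercased.

```python
def ngrams(token, min_gram=2, max_gram=15):
    """Return every substring of `token` whose length lies between
    min_gram and max_gram (illustrative helper, not Elasticsearch code)."""
    return [
        token[start:start + size]
        for size in range(min_gram, min(max_gram, len(token)) + 1)
        for start in range(len(token) - size + 1)
    ]


# Assumed output of the tokenizer and lowercase filter for the title.
title_tokens = ["cor", "interface", "monitoring"]
indexed = {gram for token in title_tokens for gram in ngrams(token)}

print(sorted(ngrams("cor")))  # ['co', 'cor', 'or']
print("cor" in indexed, "inter" in indexed)  # True True
```

Note that every substring is emitted, not only prefixes, so the index grows quickly with longer words.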
