Indexing a field for both phrase searching and partial matches

Problem description

I am creating an index on an object and want to be able to do both full phrase searches and partial matches. The type is called "deponent", and a simplified index creation is shown below:

{
   "deponent": {
      "properties": {         
         "name": {
            "type": "multi_field",
            "fields": {
               "name": {
                  "type": "string"
               },
               "full": {
                  "type": "string",
                  "index": "not_analyzed",
                  "omit_norms": true,
                  "index_options": "docs",
                  "include_in_all": false
               }
            }
         }
      }
   }
}

The intent is to index the values in the "name" field twice: once where the individual words within the field are not broken up (name.full) and once where they are broken up (name.name).
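
To make the difference between the two sub-fields concrete, here is a rough sketch of the terms each one would end up holding for a sample value. It does not talk to Elasticsearch: `analyze` is a simplified stand-in (an assumption) for the default analyzer that splits on non-word characters and lowercases, and `not_analyzed` is a hypothetical helper mirroring the `"index": "not_analyzed"` setting.

```python
import re

def analyze(text):
    """Simplified stand-in for the default analyzer: word tokens, lowercased."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

def not_analyzed(text):
    """A not_analyzed field keeps the whole value as one exact term."""
    return [text]

value = "Dr. Danny Watson"
name_terms = analyze(value)       # terms for the analyzed sub-field (name.name)
full_terms = not_analyzed(value)  # terms for name.full

print(name_terms)  # ['dr', 'danny', 'watson']
print(full_terms)  # ['Dr. Danny Watson']
```

Note that under this assumption the analyzed sub-field holds lowercased terms, while name.full holds the original string unchanged.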

I have indexed a document whose name field is set to "Dr. Danny Watson". I would expect the following behavior when executing a term query (whose query string, according to the documentation, is not analyzed):

  1. When searching name.full using "Dr. Danny Watson", the record should be returned
  2. When searching name.full using "Watson", the record should not be returned
  3. When searching name.name using "Dr. Danny Watson", the record should not be returned
  4. When searching name.name using "Watson", the record should be returned

The queries for the four points above:

1 - works as expected (returns the record)

{
    "query" : {
        "term": {
           "name.full": {
              "value": "Dr. Danny Watson"
           }
        }
    }   
}

2 - works as expected (does not return the record)

{
    "query" : {
        "term": {
           "name.full": {
              "value": "Watson"
           }
        }
    }   
}

3 - works as expected (does not return the record)

{
    "query" : {
        "term": {
           "name.name": {
              "value": "Dr. Danny Watson"
           }
        }
    }   
}

4 - does NOT work as expected (the record is not returned)

{
    "query" : {
        "term": {
           "name.name": {
              "value": "Watson"
           }
        }
    }   
}

So it seems my understanding of something is flawed. What am I missing?

Recommended answer

You don't need to refer to the field as "name.name". In a multi-field, the sub-field that shares the parent field's name is used as the default, so you should query it as just "name".

Also, it's always good to make sure the index and search analyzers are in order, so that, for instance, both your indexed terms and the search term are changed to lower case. A term query is not analyzed, whereas the "name" sub-field is analyzed at index time, and the default analyzer lowercases tokens. The indexed term is therefore "watson", which is why a term query for "Watson" returns nothing.
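
The lower-casing point can be illustrated with a small simulation. This is only a sketch under stated assumptions: `analyze` is a simplified stand-in for the default analyzer, and `term_query` and `match_query` are hypothetical helpers that mimic the behavior of the corresponding query types rather than calling Elasticsearch.

```python
import re

def analyze(text):
    """Simplified stand-in for the default analyzer: word tokens, lowercased."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

# Terms assumed to be indexed for the analyzed "name" sub-field.
indexed_terms = set(analyze("Dr. Danny Watson"))

def term_query(indexed, value):
    """Mimics a term query: the value is compared as-is, with no analysis."""
    return value in indexed

def match_query(indexed, text):
    """Mimics a match query: the text is analyzed first; any token may match."""
    return any(tok in indexed for tok in analyze(text))

print(term_query(indexed_terms, "Watson"))   # False: case differs from the indexed term
print(term_query(indexed_terms, "watson"))   # True: matches the lowercased term
print(match_query(indexed_terms, "Watson"))  # True: input is lowercased before matching
```

Against a real index, the corresponding fix would be a term query on "name" with the value "watson", or a match query on "name" with "Watson", since a match query analyzes its input before searching.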
