Elasticsearch exact field value retrieval

Problem description

Now I have run into another problem: how can I select only the field values that match a fuzzy query? Say the field education holds several university names, for example education: [MIT, Stanford University, Michigan University], but I want to select only Stanford University. I can run an aggregation alongside each fuzzy query, but that returns counts for ALL university names in the education field. What I need is an aggregation over only the exact values that match the fuzzy query. For example, if I run a fuzzy query for Stanford University and education holds [MIT, Stanfordddd University, Michigan University], I would like the query to return only 'Stanfordddd University', not all three. Thanks!

Recommended answer

For this to work, your education field must be of type nested, and you use the inner_hits feature to retrieve only the value that matched.

Below is a sample mapping showing how the education field would be defined in this case:

Mapping:

PUT my_index
{
  "mappings":{
    "mydocs":{
      "properties":{
        "education": {
          "type": "nested"
        }
      }
    }
  }
}

Sample documents:

POST my_index/mydocs/1
{
  "education": [
  {
    "value": "Stanford University"
  },
  {
    "value": "Harvard University"
  }]
}

POST my_index/mydocs/2
{
  "education": [
  {
    "value": "Stanford University"
  },
  {
    "value": "Princeton University"
  }]
}

Fuzzy query on the nested field:

POST my_index/_search
{
  "query": {
    "nested": {
      "path": "education",
      "query": {
        "bool": {
          "must": [
            {
              "span_near": {
                "clauses": [
                  {
                    "span_multi": {
                      "match": {
                        "fuzzy": {
                          "education.value": {
                            "value": "Stanford",
                            "fuzziness": 2
                          }
                        }
                      }
                    }
                  },
                  {
                    "span_multi": {
                      "match": {
                        "fuzzy": {
                          "education.value": {
                            "value": "University",
                            "fuzziness": 2
                          }
                        }
                      }
                    }
                  }
                ],
                "slop": 0,
                "in_order": false
              }
            }
          ]
        }
      },
      "inner_hits": {}
    }
  }
}

Sample response:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "my_index",
        "_type": "mydocs",
        "_id": "2",
        "_score": 0.6931472,
        "_source": {
          "education": [
            {
              "value": "Stanford University"
            },
            {
              "value": "Princeton University"
            }
          ]
        },
        "inner_hits": {
          "education": {
            "hits": {
              "total": 1,
              "max_score": 0.6931472,
              "hits": [
                {
                  "_index": "my_index",
                  "_type": "mydocs",
                  "_id": "2",
                  "_nested": {
                    "field": "education",
                    "offset": 0
                  },
                  "_score": 0.6931472,
                  "_source": {
                    "value": "Stanford University"
                  }
                }
              ]
            }
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "mydocs",
        "_id": "1",
        "_score": 0.6931472,
        "_source": {
          "education": [
            {
              "value": "Stanford University"
            },
            {
              "value": "Harvard University"
            }
          ]
        },
        "inner_hits": {
          "education": {
            "hits": {
              "total": 1,
              "max_score": 0.6931472,
              "hits": [
                {
                  "_index": "my_index",
                  "_type": "mydocs",
                  "_id": "1",
                  "_nested": {
                    "field": "education",
                    "offset": 0
                  },
                  "_score": 0.6931472,
                  "_source": {
                    "value": "Stanford University"
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

Notice the inner_hits section of each hit: it contains only the nested object that matched, here the one whose value is Stanford University, while _source still holds the full education array.
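On the client side, the matched values can be pulled out of inner_hits and the full _source ignored. The following is a minimal sketch, not part of the original answer: the helper name matched_values and the trimmed sample_response dict are made up for illustration, and it assumes the inner_hits key is the nested path (the default when inner_hits is given as {}). Pass whichever key actually appears under inner_hits in your response.

```python
from typing import Any, Dict, List


def matched_values(response: Dict[str, Any], path: str = "education") -> Dict[str, List[str]]:
    """Map each document _id to the nested values that matched the query.

    `path` is the key under "inner_hits"; by default Elasticsearch names it
    after the nested path (assumption: no custom inner_hits name was set).
    """
    result: Dict[str, List[str]] = {}
    for hit in response.get("hits", {}).get("hits", []):
        inner = (
            hit.get("inner_hits", {})
            .get(path, {})
            .get("hits", {})
            .get("hits", [])
        )
        # Each inner hit's _source is a single nested object, e.g. {"value": "..."}
        result[hit["_id"]] = [h["_source"]["value"] for h in inner]
    return result


# Trimmed copy of the sample response (illustrative only).
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "2",
                "_source": {"education": [{"value": "Stanford University"},
                                          {"value": "Princeton University"}]},
                "inner_hits": {"education": {"hits": {"hits": [
                    {"_source": {"value": "Stanford University"}}
                ]}}},
            },
            {
                "_id": "1",
                "_source": {"education": [{"value": "Stanford University"},
                                          {"value": "Harvard University"}]},
                "inner_hits": {"education": {"hits": {"hits": [
                    {"_source": {"value": "Stanford University"}}
                ]}}},
            },
        ]
    }
}

print(matched_values(sample_response))
# {'2': ['Stanford University'], '1': ['Stanford University']}
```

Counting the collected values on the client (for example with collections.Counter) is one way to approximate the per-value counts the question asks about, at the cost of paging through all hits.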

By default, Elasticsearch returns the entire document in the response. You can filter which fields are returned using _source, but it cannot filter individual values within a field.
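If the full document is not needed at all, one option is to switch off the top-level _source and rely on inner_hits alone. The sketch below only builds such a request body. The helper name inner_hits_only_body is hypothetical, and for brevity a plain match query with fuzziness stands in for the span query used above; it is a different and simpler technique, so check it against your own mapping and Elasticsearch version before relying on it.

```python
import json
from typing import Any, Dict


def inner_hits_only_body(path: str, inner_query: Dict[str, Any]) -> Dict[str, Any]:
    """Build a search body that suppresses the parent _source.

    Only the matching nested objects are then expected under "inner_hits"
    (assumption: default inner_hits settings, no custom name).
    """
    return {
        "_source": False,  # do not return the whole parent document
        "query": {
            "nested": {
                "path": path,
                "query": inner_query,
                "inner_hits": {},
            }
        },
    }


# Stand-in query: fuzzy match on both terms of the university name.
body = inner_hits_only_body(
    "education",
    {
        "match": {
            "education.value": {
                "query": "Stanford University",
                "fuzziness": 2,
                "operator": "and",
            }
        }
    },
)

print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of POST my_index/_search.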

Hope this helps!
