Return only elements of an array in an object that contain a certain value

Problem description

I've got the following document in an Elasticsearch index:

{
    "type": "foo",
    "components": [{
            "id": "1234123",
            "data_collections": [{
                    "date_time": "2020-03-02T08:14:48+00:00",
                    "group": "1",
                    "group_description": "group1",
                    "measures": [{
                            "measure_name": "MEASURE_1",
                            "actual": "23.34"
                        }, {
                            "measure_name": "MEASURE_2",
                            "actual": "5"
                        }, {
                            "measure_name": "MEASURE_3",
                            "actual": "string_message"
                        }, {
                            "measure_name": "MEASURE_4",
                            "actual": "another_string"
                        }
                    ]
                },
                {
                    "date_time": "2020-03-03T08:14:48+00:00",
                    "group": "2",
                    "group_description": "group2",
                    "measures": [{
                            "measure_name": "MEASURE_1",
                            "actual": "23.34"
                        }, {
                            "measure_name": "MEASURE_4",
                            "actual": "foo"
                        }, {
                            "measure_name": "MEASURE_5",
                            "actual": "bar"
                        }, {
                            "measure_name": "MEASURE_6",
                            "actual": "4"
                        }
                    ]
                }
            ]
        }
    ]
}

Now I'm trying to figure out a mapping and a query for this document so that the result only contains the groups and measure_names I am interested in. So far I'm able to query, but I always retrieve the whole document, which is not feasible since the array of measures can be quite large and most of the time I only want a small subset.

For example, I'm searching for documents with "group": "1" and "measure_name": "MEASURE_1", and the result I'd like to achieve looks like this:

{
    "_id": "oiqwueou8931283u12",
    "_source": {
        "type": "foo",
        "components": [{
                "id": "1234123",
                "data_collections": [{
                        "date_time": "2020-03-02T08:14:48+00:00",
                        "group": "1",
                        "group_description": "group1",
                        "measures": [{
                                "measure_name": "MEASURE_1",
                                "actual": "23.34"
                            }
                        ]
                    }
                ]
            }
        ]
    }
}

I think the closest thing to what I am looking for is the _source parameter, but as far as I know there is no way to filter on values like {"measure_name": {"value": "MEASURE_1"}}.

Thanks.

Solution

The simplest mapping that comes to mind is

PUT timo
{
  "mappings": {
    "properties": {
      "components": {
        "type": "nested",
        "properties": {
          "data_collections": {
            "type": "nested",
            "properties": {
              "measures": {
                "type": "nested"
              }
            }
          }
        }
      }
    }
  }
}

and the search query would be

GET timo/_search
{
  "_source": ["inner_hits", "type", "components.id"], 
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "components.data_collections",
            "query": {
              "term": {
                "components.data_collections.group.keyword": {
                  "value": "1"
                }
              }
            },
            "inner_hits": {}
          }
        },
        {
          "nested": {
            "path": "components.data_collections.measures",
            "query": {
              "term": {
                "components.data_collections.measures.measure_name.keyword": {
                  "value": "MEASURE_1"
                }
              }
            },
            "inner_hits": {}
          }
        }
      ]
    }
  }
}

Notice the inner_hits param under each subquery, and that the _source param is limited so that we don't return the whole hit but only the subgroups that matched. type and components.id cannot be "seen" in the nested fields, so we've included them explicitly.
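Since both subqueries follow the same pattern (a nested clause, a term match on a .keyword field, and an empty inner_hits), the request body can be assembled programmatically. The following is only a sketch: the helper names nested_term and build_search_body are made up for illustration, and the code just builds the JSON body without talking to a cluster.

```python
import json


def nested_term(path, field, value):
    """Build one nested clause: exact match on `field` under `path`,
    with inner_hits enabled so the matching nested objects are returned."""
    return {
        "nested": {
            "path": path,
            "query": {"term": {f"{path}.{field}": {"value": value}}},
            "inner_hits": {},
        }
    }


def build_search_body(group, measure_name):
    """Mirror the search request above for an arbitrary group / measure name."""
    return {
        # Same _source list as in the request above.
        "_source": ["inner_hits", "type", "components.id"],
        "query": {
            "bool": {
                "must": [
                    nested_term("components.data_collections",
                                "group.keyword", group),
                    nested_term("components.data_collections.measures",
                                "measure_name.keyword", measure_name),
                ]
            }
        },
    }


if __name__ == "__main__":
    # Prints a body equivalent to the GET timo/_search request above.
    print(json.dumps(build_search_body("1", "MEASURE_1"), indent=2))
```

The printed JSON can be pasted after GET timo/_search or handed to whichever client you use.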

The response then lists the matching sub-documents under each hit's inner_hits, keyed by nested path.

You now have precisely the attributes you need, so a bit of post-processing will get you the desired format!


I'm not familiar with a cleaner way of doing this, but if any of y'all are, I'd be glad to learn it.
