Aggregations with dynamic data / nested_objects


Question

I'm trying to aggregate over dynamically mapped fields in Elasticsearch.

For example:

POST test/_doc/1
{
    "settings": {
        "range": {
            "value": 200,
            "display": "200 km"
        },
        "transmitter": {
            "value": 1.2,
            "display": "1.2 Ghz"
        }
    }
}

The properties under settings are dynamic. Essentially, I need a query like this:

{
    "size": 0,
    "query": {
        "match_all": {}
    },
    "aggs": {
        "settings": {
            "terms": {
                "field": "settings.*.display"
            }
        }
    }
}

Since * doesn't work here, I'm wondering whether there is a way to return the field names from a Painless script and then perhaps feed them into a pipeline aggregation. I can't find the Painless equivalent of JavaScript's Object.keys(settings).

I've seen an approach with nested objects, but I'd like to avoid it: there might be many 'settings' properties, and the default limit for nested fields is 50, compared to 10000 for nested_objects.

Recommended answer

The Painless equivalent of Object.keys() is .keySet(). You can implement the following iteration logic in a scripted metric aggregation:

GET test/_search
{
  "size": 0,
  "aggs": {
    "dynamic_fields_agg": {
      "scripted_metric": {
        "init_script": "state.map = [:];",
        "map_script": """
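          // Read the dynamic object from _source; skip documents without it.
          def source = params._source['settings'];
          if (source != null) {
            for (def key : source.keySet()) {
              def entry = source[key];
              // Only collect entries that are objects with a "display" property.
              if (entry instanceof Map && entry.containsKey("display")) {
                if (state.map.containsKey(key)) {
                  state.map[key].add(entry.display);
                } else {
                  state.map[key] = [entry.display];
                }
              }
            }
          }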
        """,
        "combine_script": "return state",
        "reduce_script": "return states"
      }
    }
  }
}

This will yield something like:

{
  "aggregations":{
    "dynamic_fields_agg":{
      "value":[
        {
          "map":{
            "range":[
              "200 km"
            ],
            "transmitter":[
              "1.2 Ghz"
            ]
          }
        }
      ]
    }
  }
}
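To see why value is a list of objects that each carry a map, it can help to walk through the four script phases outside Elasticsearch. The following is only a rough Python emulation, not Elasticsearch code: the function names are hypothetical, the two-shard layout is an assumption, and the second shard's documents are made-up sample data. Each shard builds its own state, and the reduce phase returns the per-shard states unchanged.

```python
from typing import Any, Dict, List


def init_state() -> Dict[str, Any]:
    # Mirrors init_script: state.map = [:]
    return {"map": {}}


def map_doc(state: Dict[str, Any], source: Dict[str, Any]) -> None:
    # Mirrors map_script: collect every "display" value per dynamic key.
    settings = source.get("settings")
    if not isinstance(settings, dict):
        return  # document has no settings object
    for key, entry in settings.items():
        if isinstance(entry, dict) and "display" in entry:
            state["map"].setdefault(key, []).append(entry["display"])


def run_scripted_metric(shards: List[List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    states = []
    for docs in shards:
        state = init_state()
        for doc in docs:
            map_doc(state, doc)
        states.append(state)  # combine_script: return state
    return states  # reduce_script: return states


# Hypothetical data: shard 0 holds the document from the question,
# shard 1 holds invented documents for illustration only.
shards = [
    [{"settings": {"range": {"value": 200, "display": "200 km"},
                   "transmitter": {"value": 1.2, "display": "1.2 Ghz"}}}],
    [{"settings": {"range": {"value": 50, "display": "50 km"}}},
     {"title": "document without settings"}],
]
result = run_scripted_metric(shards)
print(result)
```

With a single shard the list has one entry, which matches the response above.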

Now you can post-process the values in the reduce/combine scripts however you like.
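If you would rather leave the scripts as they are, the same post-processing can also be done on the client. The sketch below is one possible approach, not part of the original answer: it assumes the response has already been parsed into a Python dict, the helper name count_display_values is hypothetical, and the second shard state in the sample is invented. It merges the per-shard maps and counts each display value per key, roughly what a terms aggregation would report.

```python
from collections import Counter
from typing import Any, Dict


def count_display_values(response: Dict[str, Any]) -> Dict[str, Counter]:
    """Merge per-shard states and count display values per dynamic key."""
    states = response["aggregations"]["dynamic_fields_agg"]["value"]
    counts: Dict[str, Counter] = {}
    for state in states:
        if not state:
            continue  # tolerate an empty or null shard state
        for key, values in state.get("map", {}).items():
            counts.setdefault(key, Counter()).update(values)
    return counts


# Sample response: the first state is taken from the output above,
# the second state is hypothetical.
response = {
    "aggregations": {
        "dynamic_fields_agg": {
            "value": [
                {"map": {"range": ["200 km"], "transmitter": ["1.2 Ghz"]}},
                {"map": {"range": ["200 km", "50 km"]}},
            ]
        }
    }
}

counts = count_display_values(response)
for key, counter in counts.items():
    print(key, dict(counter))
```

Doing the merge in reduce_script instead would shrink the response, at the cost of more Painless code to maintain.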

Using nested fields would not bring you much advantage here: wildcard paths are not allowed there either. I asked about that myself some time ago.

UPDATE: the inline version, written as plain JSON for clients that do not accept Kibana's triple-quoted strings:

GET /test/_search
{
  "size": 0,
  "aggs": {
    "dynamic_fields_agg": {
      "scripted_metric": {
        "init_script": "state.map = [:];",
        "map_script": "def source = params._source[\"settings\"];\nif (source != null) {\n  for (def key : source.keySet()) {\n    def entry = source[key];\n    if (entry instanceof Map && entry.containsKey(\"display\")) {\n      if (state.map.containsKey(key)) {\n        state.map[key].add(entry.display);\n      } else {\n        state.map[key] = [entry.display];\n      }\n    }\n  }\n}",
        "combine_script": "return state",
        "reduce_script": "return states"
      }
    }
  }
}
