Aggregating array of values in elasticsearch


Question

I need to aggregate an array as follows.

Two example documents:

{
    "_index": "log",
    "_type": "travels",
    "_id": "tnQsGy4lS0K6uT3Hwzzo-g",
    "_score": 1,
    "_source": {
        "state": "saopaulo",
        "date": "2014-10-30T17",
        "traveler": "patrick",
        "registry": "123123",
        "cities": {
            "saopaulo": 1,
            "riodejaneiro": 2,
            "total": 2
        },
        "reasons": [
            "Entrega de encomenda"
        ],
        "from": [
            "CompraRapida"
        ]
    }
},
{
    "_index": "log",
    "_type": "travels",
    "_id": "tnQsGy4lS0K6uT3Hwzzo-g",
    "_score": 1,
    "_source": {
        "state": "saopaulo",
        "date": "2014-10-31T17",
        "traveler": "patrick",
        "registry": "123123",
        "cities": {
            "saopaulo": 1,
            "curitiba": 1,
            "total": 2
        },
        "reasons": [
            "Entrega de encomenda"
        ],
        "from": [
            "CompraRapida"
        ]
    }
},

I want to aggregate the cities array to find out all the cities the traveler has gone to. I want something like this:

{
    "traveler":{
        "name":"patrick"
    },
    "cities":{
        "saopaulo":2,
        "riodejaneiro":2,
        "curitiba":1,
        "total":3
    }
}

Here total is the length of the cities array minus 1 (the total entry itself is not counted). I tried the terms aggregation and the sum aggregation, but could not produce the desired output.
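For reference, the desired result can be computed client-side from the two documents above. The sketch below is plain Python, not an Elasticsearch query. The function name merge_cities is hypothetical, and the code assumes that each per-city number is a visit count to be summed and that total means the number of distinct cities.

```python
from collections import Counter


def merge_cities(sources):
    """Sum per-city counts across documents and count the distinct cities."""
    visits = Counter()
    for source in sources:
        for city, count in source["cities"].items():
            if city != "total":  # "total" is a summary key, not a city
                visits[city] += count
    result = dict(visits)
    result["total"] = len(visits)  # number of distinct cities seen
    return result


# The "cities" objects of the two example documents.
docs = [
    {"traveler": "patrick", "cities": {"saopaulo": 1, "riodejaneiro": 2, "total": 2}},
    {"traveler": "patrick", "cities": {"saopaulo": 1, "curitiba": 1, "total": 2}},
]

print(merge_cities(docs))
```

This only pins down what the aggregation is expected to return; it does not replace doing the work inside Elasticsearch.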

The document structure can be changed, so if a different structure would help, I'd be glad to know.

Recommended answer

In the document posted above, "cities" is not a JSON array; it is a JSON object. If changing the document structure is a possibility, I would change cities in the document to be an array of objects.

Example document:

 "cities": [
   {
     "city": "saopaulo",
     "count": 2
   },
   {
     "city": "riodejaneiro",
     "count": 1
   }
 ]
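If existing documents have to be converted to this structure, the conversion could look roughly like the sketch below. The helper name cities_to_nested is hypothetical, and the output field names city and count are an assumption taken from the mapping used in this answer; the "total" key is dropped because it can be derived from the array.

```python
def cities_to_nested(cities):
    """Turn {"saopaulo": 1, "total": 2} into [{"city": "saopaulo", "count": 1}]."""
    return [
        {"city": name, "count": int(count)}
        for name, count in cities.items()
        if name != "total"  # summary key, not a city
    ]


# The "cities" object of the first example document in the question.
old = {"saopaulo": 1, "riodejaneiro": 2, "total": 2}

print(cities_to_nested(old))
```

Each converted document would then be reindexed with the new "cities" value.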

You would then need to set cities to be of type nested (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-nested-type.html#_mapping) in the index mapping:

   "mappings": {
         "<type_name>": {
            "properties": {
               "cities": {
                  "type": "nested",
                  "properties": {
                     "city": {
                        "type": "string"
                     },
                     "count": {
                        "type": "integer"
                     },
                     "value": {
                        "type": "long"
                     }
                  }
               },
               "date": {
                  "type": "date",
                  "format": "dateOptionalTime"
               },
               "registry": {
                  "type": "string"
               },
               "state": {
                  "type": "string"
               },
               "traveler": {
                  "type": "string"
               }
            }
         }
      }

After that you could use a nested aggregation (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-nested-aggregation.html) to get the number of distinct cities per traveler. The query would look something along these lines:

{
   "query": {
      "match": {
         "traveler": "patrick"
      }
   },
   "aggregations": {
      "city_travelled": {
         "nested": {
            "path": "cities"
         },
         "aggs": {
            "citycount": {
               "cardinality": {
                  "field": "cities.city"
               }
            }
         }
      }
   }
}
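The cardinality aggregation returns only the number of distinct cities. To also get a total per city, one possibility is to put a terms aggregation on cities.city with a sum on cities.count inside the same nested aggregation. The sketch below builds such a request body as a Python dict and flattens a response into the shape asked for in the question. The aggregation names by_city and visits and the helper flatten are hypothetical, the sample response is hand-written in the shape these aggregation types return rather than captured from a cluster, and the code assumes each city name is indexed as a single term.

```python
import json

# Request body: nested -> terms(cities.city) -> sum(cities.count),
# next to the cardinality aggregation from the answer.
body = {
    "query": {"match": {"traveler": "patrick"}},
    "size": 0,  # only the aggregations are needed, not the hits
    "aggs": {
        "city_travelled": {
            "nested": {"path": "cities"},
            "aggs": {
                "by_city": {
                    "terms": {"field": "cities.city"},
                    "aggs": {"visits": {"sum": {"field": "cities.count"}}},
                },
                "citycount": {"cardinality": {"field": "cities.city"}},
            },
        }
    },
}


def flatten(response):
    """Collapse the aggregation response into {city: visits, ..., "total": n}."""
    nested = response["aggregations"]["city_travelled"]
    cities = {
        bucket["key"]: int(bucket["visits"]["value"])
        for bucket in nested["by_city"]["buckets"]
    }
    cities["total"] = nested["citycount"]["value"]
    return cities


# Hand-written response for illustration only.
sample_response = {
    "aggregations": {
        "city_travelled": {
            "doc_count": 4,
            "by_city": {
                "buckets": [
                    {"key": "saopaulo", "doc_count": 2, "visits": {"value": 2.0}},
                    {"key": "riodejaneiro", "doc_count": 1, "visits": {"value": 2.0}},
                    {"key": "curitiba", "doc_count": 1, "visits": {"value": 1.0}},
                ]
            },
            "citycount": {"value": 3},
        }
    }
}

print(json.dumps(body, indent=2))
print(flatten(sample_response))
```

Note that a terms aggregation returns only the top buckets by default (10), so a traveler with more cities would need a larger size on by_city.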
