Elasticsearch analytics percent


Question

I am using Elasticsearch 1.7.3 to accumulate data for analytics reports.

I have an index that holds documents where each document has a numeric field called 'duration' (how many milliseconds the request took), and a string field called 'component'. There can be many documents with the same component name.

For example:

{"component": "A", "duration": 10}
{"component": "B", "duration": 27}
{"component": "A", "duration": 5}
{"component": "C", "duration": 2}

I would like to produce a report that states for each component:

The sum of all 'duration' fields for this component.

A: 15
B: 27
C: 2

This sum as a percentage of the total duration across all documents. In my example:

A: (10+5) / (10+27+5+2) * 100
B: 27 / (10+27+5+2) * 100
C: 2 / (10+27+5+2) * 100

The percentage of documents belonging to each component, out of the total number of documents.

A: 2 / 4 * 100
B: 1 / 4 * 100
C: 1 / 4 * 100

How do I do that with Elasticsearch 1.7.3?

Solution

With ES 1.7.3, there is no way to compute data based on the results of two different aggregations. This can be done in ES 2.0 with pipeline aggregations, though.

However, what you're asking for is not too complicated to do on the client side with 1.7.3. The query below returns everything needed to compute the figures you expect:

POST components/_search
{
   "size": 0,
   "aggs": {
      "total_duration": {
         "sum": {
            "field": "duration"
         }
      },
      "components": {
         "terms": {
            "field": "component"
         },
         "aggs": {
            "duration_sum": {
               "sum": {
                  "field": "duration"
               }
            }
         }
      }
   }
}

The results would look like this:

{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 4,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "total_duration": {
         "value": 44
      },
      "components": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "a",
               "doc_count": 2,
               "duration_sum": {
                  "value": 15
               }
            },
            {
               "key": "b",
               "doc_count": 1,
               "duration_sum": {
                  "value": 27
               }
            },
            {
               "key": "c",
               "doc_count": 1,
               "duration_sum": {
                  "value": 2
               }
            }
         ]
      }
   }
}
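
As a sanity check, the numbers in this response can be reproduced by hand from the four sample documents in the question. The sketch below mimics what the `sum` and `terms` aggregations compute; `aggregateLocally` is a hypothetical helper name used only for illustration, and it keeps the original upper-case component names. The lower-case keys in the response above are most likely because the `component` string field is analyzed, which is the default for string fields in 1.x.

```javascript
// The four sample documents from the question.
var docs = [
  { component: "A", duration: 10 },
  { component: "B", duration: 27 },
  { component: "A", duration: 5 },
  { component: "C", duration: 2 }
];

// Hypothetical helper: groups documents by component and sums durations,
// roughly what the terms + sum aggregations do on the server.
function aggregateLocally(docs) {
  var totals = { total_docs: docs.length, total_duration: 0, buckets: {} };
  docs.forEach(function (doc) {
    totals.total_duration += doc.duration;
    var bucket = totals.buckets[doc.component];
    if (!bucket) {
      bucket = totals.buckets[doc.component] = { doc_count: 0, duration_sum: 0 };
    }
    bucket.doc_count += 1;
    bucket.duration_sum += doc.duration;
  });
  return totals;
}

var totals = aggregateLocally(docs);
console.log(JSON.stringify(totals, null, 2));
```

The overall total comes out as 44 (10 + 27 + 5 + 2), matching `total_duration.value`, and the per-component sums are 15, 27 and 2, matching the `duration_sum` values in the buckets.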

Now all you need to do is the following. I'm using JavaScript, but you can do it in any other language that can read JSON.

var response = ...the JSON response above...
var total_duration = response.aggregations.total_duration.value;
var total_docs = response.hits.total;

response.aggregations.components.buckets.forEach(function(comp_stats) {
    // total duration for the component
    var total_duration_comp = comp_stats.duration_sum.value;

    // percentage duration of the component
    var perc_duration_comp = total_duration_comp / total_duration * 100;

    // percentage documents for the component
    var perc_doc_comp = comp_stats.doc_count / total_docs * 100;
});
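
For reference, here is a self-contained version of the same calculation that can be run as-is, for example with Node.js. It is only a sketch: the response is hard-coded and trimmed to the fields the calculation reads, `buildReport` is a hypothetical helper name (not part of any Elasticsearch client), and the zero checks are an added assumption to avoid dividing by zero on an empty index.

```javascript
// Sample response from above, trimmed to the fields used below.
var response = {
  hits: { total: 4 },
  aggregations: {
    total_duration: { value: 44 },
    components: {
      buckets: [
        { key: "a", doc_count: 2, duration_sum: { value: 15 } },
        { key: "b", doc_count: 1, duration_sum: { value: 27 } },
        { key: "c", doc_count: 1, duration_sum: { value: 2 } }
      ]
    }
  }
};

// Hypothetical helper: turns the aggregation response into one row per component.
function buildReport(response) {
  var totalDuration = response.aggregations.total_duration.value;
  var totalDocs = response.hits.total;

  return response.aggregations.components.buckets.map(function (bucket) {
    var durationSum = bucket.duration_sum.value;
    return {
      component: bucket.key,
      duration_sum: durationSum,
      // Fall back to 0 when the index is empty and the totals are 0.
      perc_duration: totalDuration ? durationSum / totalDuration * 100 : 0,
      perc_docs: totalDocs ? bucket.doc_count / totalDocs * 100 : 0
    };
  });
}

var report = buildReport(response);
report.forEach(function (row) {
  console.log(row.component + ": sum=" + row.duration_sum +
    ", duration=" + row.perc_duration.toFixed(2) + "%" +
    ", docs=" + row.perc_docs.toFixed(2) + "%");
});
```

With the sample data this gives roughly 34.09%, 61.36% and 4.55% of the total duration for a, b and c, and 50%, 25% and 25% of the documents. One caveat: as far as I know, the `terms` aggregation only returns the top 10 buckets by default, so with more distinct components you would need to raise its `size` for the report to cover all of them.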
