Elasticsearch analytics percent


Problem description

I am using Elasticsearch 1.7.3 to accumulate data for analytics reports.

I have an index in which each document has a numeric field called 'duration' (how many milliseconds the request took) and a string field called 'component'. Many documents can share the same component name.

For example:

{"component": "A", "duration": 10}
{"component": "B", "duration": 27}
{"component": "A", "duration": 5}
{"component": "C", "duration": 2}

I would like to produce a report that states, for each component:

The sum of all 'duration' fields for this component.

A: 15
B: 27
C: 2

The percentage of this sum out of the total duration of all documents. In my example:

A: (10+5) / (10+27+5+2) * 100
B: 27 / (10+27+5+2) * 100
C: 2 / (10+27+5+2) * 100

The percentage of documents that belong to each component, out of the total number of documents.

A: 2 / 4 * 100
B: 1 / 4 * 100
C: 1 / 4 * 100
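
To make the expected numbers concrete, the three figures can be worked out directly from the four sample documents, without Elasticsearch. This is only a sketch of the arithmetic; `sampleDocs` and `expectedReport` are hypothetical names introduced here for illustration.

```javascript
// Sample documents copied from the question.
var sampleDocs = [
    {"component": "A", "duration": 10},
    {"component": "B", "duration": 27},
    {"component": "A", "duration": 5},
    {"component": "C", "duration": 2}
];

// Hypothetical helper: computes the three report figures per component.
function expectedReport(docs) {
    var totalDuration = 0;
    var perComponent = {};
    docs.forEach(function (doc) {
        totalDuration += doc.duration;
        var entry = perComponent[doc.component] || {durationSum: 0, docCount: 0};
        entry.durationSum += doc.duration;
        entry.docCount += 1;
        perComponent[doc.component] = entry;
    });
    return Object.keys(perComponent).map(function (name) {
        var entry = perComponent[name];
        return {
            component: name,
            durationSum: entry.durationSum,
            // share of the total duration of all documents
            durationPercent: entry.durationSum / totalDuration * 100,
            // share of the total number of documents
            docPercent: entry.docCount / docs.length * 100
        };
    });
}

var expected = expectedReport(sampleDocs);
expected.forEach(function (row) {
    console.log(row.component + ": sum=" + row.durationSum +
        ", duration%=" + row.durationPercent.toFixed(2) +
        ", docs%=" + row.docPercent.toFixed(2));
});
```

For the sample data this gives a duration share of 15/44 for A, 27/44 for B and 2/44 for C, and a document share of 50%, 25% and 25%.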

How do I do that with Elasticsearch 1.7.3?

Recommended answer

With ES 1.7.3, there is no way to compute data based on the results of two different aggregations. This can be done in ES 2.0 with pipeline aggregations, though.

However, what you're asking for is not too complicated to do on the client side with 1.7.3. The query below returns everything you need to compute the figures you expect:

POST components/_search
{
   "size": 0,
   "aggs": {
      "total_duration": {
         "sum": {
            "field": "duration"
         }
      },
      "components": {
         "terms": {
            "field": "component"
         },
         "aggs": {
            "duration_sum": {
               "sum": {
                  "field": "duration"
               }
            }
         }
      }
   }
}
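
If the request body is built in client code rather than written by hand, a small helper along these lines could produce it. This is a sketch under assumptions: `buildReportQuery` and its parameters are hypothetical names, not part of any Elasticsearch client. The optional `maxBuckets` argument is there because a terms aggregation returns only the top 10 buckets by default, which matters once there are more than 10 distinct components.

```javascript
// Hypothetical helper: builds the same request body as the query above.
function buildReportQuery(componentField, durationField, maxBuckets) {
    var terms = {field: componentField};
    if (maxBuckets !== undefined) {
        // Raise the bucket limit when more than 10 components are expected.
        terms.size = maxBuckets;
    }
    return {
        size: 0, // return no hits, only aggregations
        aggs: {
            // total duration over all matching documents
            total_duration: {sum: {field: durationField}},
            // one bucket per component, each with its own duration sum
            components: {
                terms: terms,
                aggs: {
                    duration_sum: {sum: {field: durationField}}
                }
            }
        }
    };
}

var reportQuery = buildReportQuery("component", "duration", 50);
console.log(JSON.stringify(reportQuery, null, 2));
```

Calling `buildReportQuery("component", "duration")` without the third argument yields the query exactly as written above.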

The results look like this:

{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 4,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "total_duration": {
         "value": 44
      },
      "components": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "a",
               "doc_count": 2,
               "duration_sum": {
                  "value": 15
               }
            },
            {
               "key": "b",
               "doc_count": 1,
               "duration_sum": {
                  "value": 27
               }
            },
            {
               "key": "c",
               "doc_count": 1,
               "duration_sum": {
                  "value": 2
               }
            }
         ]
      }
   }
}

Now all you need to do is the following. I'm using JavaScript, but you can do it in any other language that can read JSON.

// responseText is an assumed name for the raw JSON response shown above
var response = JSON.parse(responseText);
var total_duration = response.aggregations.total_duration.value;
var total_docs = response.hits.total;

response.aggregations.components.buckets.forEach(function(comp_stats) {
    // total duration for the component
    var total_duration_comp = comp_stats.duration_sum.value;

    // percentage duration of the component
    var perc_duration_comp = total_duration_comp / total_duration * 100;

    // percentage documents for the component
    var perc_doc_comp = comp_stats.doc_count / total_docs * 100;
});
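
The calculation above can also be sketched as a self-contained script that runs against a trimmed copy of the sample response and prints one line per component. `sampleResponse`, `percentOf` and `buildReport` are hypothetical names used only for this illustration. The zero check is an added assumption to cover an empty index, which the original snippet does not handle.

```javascript
// Trimmed copy of the response above: only the fields the calculation reads.
var sampleResponse = {
    hits: {total: 4},
    aggregations: {
        total_duration: {value: 44},
        components: {
            buckets: [
                {key: "a", doc_count: 2, duration_sum: {value: 15}},
                {key: "b", doc_count: 1, duration_sum: {value: 27}},
                {key: "c", doc_count: 1, duration_sum: {value: 2}}
            ]
        }
    }
};

// Hypothetical helper: returns 0 instead of NaN when the whole is 0.
function percentOf(part, whole) {
    return whole === 0 ? 0 : part / whole * 100;
}

// Hypothetical helper: one report row per component bucket.
function buildReport(response) {
    var totalDuration = response.aggregations.total_duration.value;
    var totalDocs = response.hits.total;
    return response.aggregations.components.buckets.map(function (bucket) {
        return {
            component: bucket.key,
            durationSum: bucket.duration_sum.value,
            durationPercent: percentOf(bucket.duration_sum.value, totalDuration),
            docPercent: percentOf(bucket.doc_count, totalDocs)
        };
    });
}

var report = buildReport(sampleResponse);
report.forEach(function (row) {
    console.log(row.component + ": sum=" + row.durationSum +
        ", duration%=" + row.durationPercent.toFixed(2) +
        ", docs%=" + row.docPercent.toFixed(2));
});
```

Returning rows instead of computing values inside the loop keeps the numbers available for whatever renders the report.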
