Range ElasticSearch Aggregation

Question

I need to compute a pipeline aggregation in ElasticSearch and I can't figure out how to express it.

Each document has an email address and an amount. I need to sum the amounts per unique email and then output range buckets that count how many emails fall into each range of summed amount.

{ "0 - 99": 300, "100 - 400": 100 ...}

This would basically be the expected output (the keys would be transformed in my application code), indicating that 300 unique emails have each cumulatively received at most 99 (amount) across all documents.

Intuitively, I would expect a query like the one below to work. However, range does not appear to be usable as a pipeline aggregation (it does not accept buckets_path).

What is the correct approach here?

{
 aggs: {
   users: {
     terms: {
       field: "email"
     },
     aggs: {
       amount_received: {
         sum: {
           field: "amount"
         }
       }
     }
   },
   amount_ranges: {
     range: {
       buckets_path: "users>amount_received",
       ranges: [
           { to: 99.0 },
           { from: 100.0, to: 299.0 },
           { from: 300.0, to: 599.0 },
           { from: 600.0 }
       ]
     }
   }
}
}
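For reference, the computation being asked for can be stated outside Elasticsearch. The Python sketch below sums the amounts per email and then counts how many emails land in each range. The sample documents, the labels, and the half-open boundaries (for example, 100 up to but not including 300) are assumptions made for illustration, not part of the original question.

```python
from collections import defaultdict

# Half-open ranges [lower, upper); None means unbounded on that side.
# Labels and boundaries are illustrative assumptions.
RANGES = [
    ("0 - 99", None, 100),
    ("100 - 299", 100, 300),
    ("300 - 599", 300, 600),
    ("600+", 600, None),
]


def count_emails_by_total(docs, ranges=RANGES):
    """Sum `amount` per `email`, then count emails per range of that sum."""
    totals = defaultdict(float)
    for doc in docs:
        totals[doc["email"]] += doc["amount"]

    counts = {label: 0 for label, _, _ in ranges}
    for total in totals.values():
        for label, lower, upper in ranges:
            above = lower is None or total >= lower
            below = upper is None or total < upper
            if above and below:
                counts[label] += 1
                break
    return counts


# Made-up documents, for illustration only.
docs = [
    {"email": "a@example.com", "amount": 40.0},
    {"email": "a@example.com", "amount": 30.0},   # a totals 70
    {"email": "b@example.com", "amount": 150.0},  # b totals 150
    {"email": "c@example.com", "amount": 500.0},
    {"email": "c@example.com", "amount": 200.0},  # c totals 700
]

result = count_emails_by_total(docs)
print(result)
```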

Recommended answer

There's no pipeline aggregation that does this directly. However, the following approach should suit your needs: repeat the same terms/sum aggregation once per range, and in each copy use a bucket_selector pipeline aggregation to keep only the emails whose summed amount falls within that range.

POST index/_search
{
  "size": 0,
  "aggs": {
    "users_99": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "-99": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived < 100"
          }
        }
      }
    },
    "users_100_299": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "100-299": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 100 && params.amountReceived < 300"
          }
        }
      }
    },
    "users_300_599": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "300-599": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 300 && params.amountReceived < 600"
          }
        }
      }
    },
    "users_600": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "600": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 600"
          }
        }
      }
    }
  }
}
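Since the four aggregations differ only in their names and script conditions, the request body can also be generated rather than written by hand. The following is a minimal Python sketch of that idea; the helper names `selector_script` and `build_query` and the sub-aggregation name `in_range` are hypothetical choices for this example, and the body it produces is otherwise meant to mirror the query above.

```python
import json


def selector_script(lower, upper):
    """Build the script condition for one half-open range [lower, upper)."""
    parts = []
    if lower is not None:
        parts.append(f"params.amountReceived >= {lower}")
    if upper is not None:
        parts.append(f"params.amountReceived < {upper}")
    return " && ".join(parts)


def build_query(ranges, email_field="email", amount_field="amount", size=1000):
    """Repeat the same terms/sum aggregation once per range."""
    aggs = {}
    for name, lower, upper in ranges:
        aggs[name] = {
            "terms": {"field": email_field, "size": size},
            "aggs": {
                "amount_received": {"sum": {"field": amount_field}},
                # "in_range" is an arbitrary name chosen for this sketch.
                "in_range": {
                    "bucket_selector": {
                        "buckets_path": {"amountReceived": "amount_received"},
                        "script": selector_script(lower, upper),
                    }
                },
            },
        }
    return {"size": 0, "aggs": aggs}


# (aggregation name, inclusive lower bound, exclusive upper bound)
RANGES = [
    ("users_99", None, 100),
    ("users_100_299", 100, 300),
    ("users_300_599", 300, 600),
    ("users_600", 600, None),
]

query = build_query(RANGES)
print(json.dumps(query, indent=2))
```

The printed JSON can be sent as the body of the same `_search` request. Adding or changing a range then only requires editing the `RANGES` list.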

In the results, the number of buckets in users_99 is the number of unique emails whose summed amount is less than 100. Similarly, users_100_299 contains as many buckets as there are unique emails whose summed amount is at least 100 and less than 300, and so on. Because each terms aggregation requests at most 1000 buckets ("size": 1000), no count can exceed 1000.
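Turning the response into the shape requested in the question is then a matter of counting buckets in application code. Here is a small Python sketch of that step; the `sample_response` is made-up data in the general shape of a search response with aggregations, and the output labels are assumptions.

```python
def count_emails_per_range(response, labels):
    """Map each aggregation name to its label and count the buckets it kept."""
    aggregations = response["aggregations"]
    return {
        label: len(aggregations[name]["buckets"])
        for name, label in labels.items()
    }


# Aggregation name -> output key (the labels are illustrative).
LABELS = {
    "users_99": "0 - 99",
    "users_100_299": "100 - 299",
    "users_300_599": "300 - 599",
    "users_600": "600+",
}

# Made-up response fragment, for illustration only.
sample_response = {
    "aggregations": {
        "users_99": {
            "buckets": [
                {"key": "a@example.com", "doc_count": 2, "amount_received": {"value": 70.0}},
                {"key": "d@example.com", "doc_count": 1, "amount_received": {"value": 12.5}},
            ]
        },
        "users_100_299": {
            "buckets": [
                {"key": "b@example.com", "doc_count": 1, "amount_received": {"value": 150.0}},
            ]
        },
        "users_300_599": {"buckets": []},
        "users_600": {
            "buckets": [
                {"key": "c@example.com", "doc_count": 2, "amount_received": {"value": 700.0}},
            ]
        },
    }
}

counts = count_emails_per_range(sample_response, LABELS)
print(counts)
```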
