Count buckets returned by sub aggregation


Question

I need to count the number of buckets in a result set returned by a pipeline aggregation. The problem is that my query, which uses a bucket_selector script, looks like this:

POST visitor_carts/_search
{
  "size": 0,
  "aggs": {
    "visitors": {
      "terms": {"field" : "visitor_id"},
      "aggs": {
        "one_purchase": {
          "bucket_selector": {
            "buckets_path": {
              "nb_purchases": "_count"
            },
            "script": "params.nb_purchases == 3"
          }
        }
      }
    }
  }
}

It returns something like this:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "visitors" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "2",
          "doc_count" : 3
        },
        {
          "key" : "3",
          "doc_count" : 3
        }
      ]
    }
  }
}

Under the buckets key I can see the list of visitors that meet my condition (every visitor identified by visitor_id must have exactly three documents in the visitor_carts index), but that is not very helpful because the query has to handle hundreds of thousands of visitors. I am using PHP to process the results; in theory it could count the result set, but with a large volume of visitors that does not feel like the best idea. Is there a way to output just the count of valid buckets next to doc_count_error_upper_bound and sum_other_doc_count? It is a little odd that the aggregation stats include no bucket_count, as it seems quite useful.

Or maybe this can be done in a different way? This question is a follow-up to this one: Get user count that made a specific number of purchases

Here is my visitor_carts mapping:

{
  "mapping": {
    "_doc": {
      "dynamic": "false",
      "properties": {
        "created_dt": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        },
        "order_id": {
          "type": "keyword"
        },
        "visitor_id": {
          "type": "keyword"
        }
      }
    }
  }
}


Recommended answer

You can use the Stats Bucket Aggregation to get the count of buckets.

Below is how the query would look.

POST visitor_carts/_search
{
  "size": 0,
  "aggs": {
    "visitors": {
      "terms": {
        "field" : "visitor_id"
      },
      "aggs": {
        "one_purchase": {
          "bucket_selector": {
            "buckets_path": {
              "nb_purchases": "_count"
            },
            "script": "params.nb_purchases == 3"
          }
        }
      }
    },
    "mybucketcount":{
      "stats_bucket": {
        "buckets_path":"visitors._count"
      }
    }
  }
}



Aggregation Result:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 8,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "visitors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "2",
          "doc_count": 3
        },
        {
          "key": "3",
          "doc_count": 3
        }
      ]
    },
    "mybucketcount": {
      "count": 2,              <---- This is the count you are looking for
      "min": 3,
      "max": 3,
      "avg": 3,
      "sum": 6
    }
  }
}
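On the client side, only the count field of the stats_bucket result needs to be read; the bucket list itself never has to be iterated. Below is a minimal sketch in Python (the question uses PHP, but the access path is the same) that parses a trimmed copy of the response above. The helper name bucket_count is hypothetical and only for illustration, and no Elasticsearch client library is involved.

```python
import json

# Trimmed copy of the aggregation result shown above (annotation removed
# so that it is valid JSON).
SAMPLE_RESPONSE = """
{
  "aggregations": {
    "visitors": {
      "buckets": [
        {"key": "2", "doc_count": 3},
        {"key": "3", "doc_count": 3}
      ]
    },
    "mybucketcount": {"count": 2, "min": 3, "max": 3, "avg": 3, "sum": 6}
  }
}
"""


def bucket_count(response: dict, agg_name: str = "mybucketcount") -> int:
    """Return the bucket count reported by a stats_bucket aggregation.

    agg_name is the name given to the stats_bucket aggregation in the query.
    """
    return int(response["aggregations"][agg_name]["count"])


response = json.loads(SAMPLE_RESPONSE)
count = bucket_count(response)
print(count)
```

Two caveats are worth keeping in mind. The terms aggregation returns only the top 10 buckets by default, so the count is bounded by its size parameter, which would probably need to be raised for a large number of visitors. To avoid transferring the bucket list at all, the filter_path query parameter (for example ?filter_path=aggregations.mybucketcount.count) should restrict the response to the count alone.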

Let me know if this helps!
