Count buckets returned by sub aggregation
Question
I need to count the number of buckets in a result set returned by a pipeline aggregation. The problem is that my query, which uses a bucket_selector script:
POST visitor_carts/_search
{
"size": 0,
"aggs": {
"visitors": {
"terms": {"field" : "visitor_id"},
"aggs": {
"one_purchase": {
"bucket_selector": {
"buckets_path": {
"nb_purchases": "_count"
},
"script": "params.nb_purchases == 3"
}
}
}
}
}
}
returns something like this:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 5,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"visitors" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "2",
"doc_count" : 3
},
{
"key" : "3",
"doc_count" : 3
}
]
}
}
}
Under the buckets key I can see a list of visitors that meet my condition (every visitor identified by visitor_id must have exactly three documents in the visitor_carts index), but that is not very helpful because the query has to handle hundreds of thousands of visitors. I am using PHP to process the results. In theory it could count the result set, but with a large volume of visitors that does not feel like the best idea. Is there a way to output just the count of valid buckets next to doc_count_error_upper_bound and sum_other_doc_count? It is a little odd that no bucket_count is included in the aggregation stats, as it seems quite useful.
Or maybe this can be done in a different way? This question is a follow-up to this one: Get user count that made a specific number of purchases
Here is my visitor_carts mapping:
{
"mapping": {
"_doc": {
"dynamic": "false",
"properties": {
"created_dt": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"order_id": {
"type": "keyword"
},
"visitor_id": {
"type": "keyword"
}
}
}
}
}
Recommended answer
You can use a Stats Bucket Aggregation to get the count of buckets.
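To make the idea concrete, here is a minimal plain-Python sketch of what the three aggregations appear to compute together, judging by the result shown further down. It does not call Elasticsearch. The visitor_ids list is hypothetical sample data, chosen to match the eight hits and two surviving buckets in that result.

```python
from collections import Counter

# Hypothetical sample data: one visitor_id per document in visitor_carts.
visitor_ids = ["1", "2", "2", "2", "3", "3", "3", "4"]

# terms aggregation: one bucket per visitor_id, with its doc_count.
doc_counts = Counter(visitor_ids)

# bucket_selector: keep only buckets whose _count equals 3.
kept = {key: count for key, count in doc_counts.items() if count == 3}

# stats_bucket on "visitors._count": summary over the remaining buckets.
values = list(kept.values())
stats = {
    "count": len(values),
    "min": min(values),
    "max": max(values),
    "avg": sum(values) / len(values),
    "sum": sum(values),
}
print(stats)
```

The "count" entry is the number of buckets that survived the selector, which is the figure the question asks for.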
Here is what the query looks like:
POST visitor_carts/_search
{
"size": 0,
"aggs": {
"visitors": {
"terms": {
"field" : "visitor_id"
},
"aggs": {
"one_purchase": {
"bucket_selector": {
"buckets_path": {
"nb_purchases": "_count"
},
"script": "params.nb_purchases == 3"
}
}
}
},
"mybucketcount":{
"stats_bucket": {
"buckets_path":"visitors._count"
}
}
}
}
Aggregation result:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 8,
"max_score": 0,
"hits": []
},
"aggregations": {
"visitors": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "2",
"doc_count": 3
},
{
"key": "3",
"doc_count": 3
}
]
},
"mybucketcount": {
"count": 2, <---- This is the count you are looking for
"min": 3,
"max": 3,
"avg": 3,
"sum": 6
}
}
}
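On the client side, only the count field needs to be read, so there is no need to iterate over the buckets. The question mentions PHP. The sketch below uses Python purely for illustration, with a trimmed copy of the result above as input. The helper name bucket_count is hypothetical and not part of any client library.

```python
import json

# Trimmed copy of the aggregation result shown above.
raw_response = """
{
  "aggregations": {
    "visitors": {
      "buckets": [
        {"key": "2", "doc_count": 3},
        {"key": "3", "doc_count": 3}
      ]
    },
    "mybucketcount": {"count": 2, "min": 3, "max": 3, "avg": 3, "sum": 6}
  }
}
"""


def bucket_count(response: dict, agg_name: str = "mybucketcount") -> int:
    """Return the bucket count reported by a stats_bucket aggregation."""
    return int(response["aggregations"][agg_name]["count"])


response = json.loads(raw_response)
matching_visitors = bucket_count(response)
print(matching_visitors)
```

Two things are worth checking against your own cluster. First, the terms aggregation returns only the top buckets (10 by default) unless size is raised, so the count is presumably capped by that setting. Second, if the bucket list itself is not needed, the filter_path request parameter (for example filter_path=aggregations.mybucketcount) should trim the response down to that one aggregation.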
Let me know if this helps!