How do I sort buckets by a terms aggregation's nested doc_count?
Question
I have an index, invoices, that I need to aggregate into yearly buckets and then sort.
I have succeeded in using Bucket Sort to sort my buckets by simple sum values (revenue and tax). However, I am struggling to sort by more deeply nested doc_count values (status).
I want to order my buckets not only by revenue, but also by the number of docs with a status field equal to 1, 2, 3, etc.
Documents in the index look like this:
"_source": {
"created_at": "2018-07-07T03:11:34.327Z",
"status": 3,
"revenue": 68.474,
"tax": 6.85,
}
I am requesting an aggregation like this:
const params = {
  index: 'invoices',
  size: 0,
  body: {
    aggs: {
      sales: {
        date_histogram: {
          field: 'created_at',
          interval: 'year',
        },
        aggs: {
          total_revenue: { sum: { field: 'revenue' } },
          total_tax: { sum: { field: 'tax' } },
          statuses: {
            terms: {
              field: 'status',
            },
          },
          sales_bucket_sort: {
            bucket_sort: {
              sort: [{ total_revenue: { order: 'desc' } }],
            },
          },
        },
      },
    },
  },
}
The response (truncated) looks like this:
"aggregations": {
"sales": {
"buckets": [
{
"key_as_string": "2016-01-01T00:00:00.000Z",
"key": 1451606400000,
"doc_count": 254,
"total_tax": {
"value": 735.53
},
"statuses": {
"sum_other_doc_count": 0,
"buckets": [
{
"key": 2,
"doc_count": 59
},
{
"key": 1,
"doc_count": 58
},
{
"key": 5,
"doc_count": 57
},
{
"key": 3,
"doc_count": 40
},
{
"key": 4,
"doc_count": 40
}
]
},
"total_revenue": {
"value": 7355.376005351543
}
},
]
}
}
For example, I want to sort by key: 1, that is, order the buckets according to which one has the greatest number of docs with a status value of 1. I tried to order my terms aggregation and then specify the desired key like this:
statuses: {
  terms: {
    field: 'status',
    order: { _key: 'asc' },
  },
},
sales_bucket_sort: {
  bucket_sort: {
    sort: [{ 'statuses.buckets[0]._doc_count': { order: 'desc' } }],
  },
},
However, this did not work. It didn't raise an error; it just didn't seem to have any effect.
I noticed someone else on SO had a similar question many years ago, but I was hoping a better answer had emerged since then: Elasticsearch aggregation. Order by nested bucket doc_count
Thanks!
Recommended answer
Never mind, I figured it out. I added a separate filter aggregation like this:
aggs: {
  total_revamnt: { sum: { field: 'revamnt' } },
  total_purchamnt: { sum: { field: 'purchamnt' } },
  approved_invoices: {
    filter: {
      term: {
        status: 1,
      },
    },
  },
Then I was able to bucket sort on that value like this:
sales_bucket_sort: {
  bucket_sort: {
    sort: [{ 'approved_invoices>_count': { order: 'asc' } }],
  },
},
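For reference, the two fragments above can be combined into one request. This is only a sketch under a few assumptions: it uses the revenue and tax fields from the question rather than the revamnt and purchamnt names in the answer, and the helper buildSalesParams and the aggregation name status_count are hypothetical names chosen for illustration, not part of the Elasticsearch client. It defaults to 'desc' so the year with the most matching invoices comes first, whereas the answer's snippet uses 'asc'. On recent Elasticsearch versions the date_histogram option is calendar_interval rather than interval.

```javascript
// Hypothetical helper: builds the search params for sorting yearly buckets
// by the number of invoices that have a given status.
function buildSalesParams(status, order = 'desc') {
  return {
    index: 'invoices',
    size: 0,
    body: {
      aggs: {
        sales: {
          // Kept as in the question; newer versions expect calendar_interval.
          date_histogram: { field: 'created_at', interval: 'year' },
          aggs: {
            total_revenue: { sum: { field: 'revenue' } },
            total_tax: { sum: { field: 'tax' } },
            // Single-bucket filter aggregation: its doc_count is the number
            // of invoices in each year with the requested status.
            status_count: { filter: { term: { status } } },
            sales_bucket_sort: {
              bucket_sort: {
                // "_count" refers to the filter aggregation's doc_count.
                sort: [{ 'status_count>_count': { order } }],
              },
            },
          },
        },
      },
    },
  };
}

// Example: order years by how many invoices have status 1, largest first.
const statusOneParams = buildSalesParams(1);
```

Sorting by a different status only means passing another value, for example buildSalesParams(3), since each request carries its own filter aggregation.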