How do I sort buckets by Term Aggregation's nested doc_count?

Problem description

I have an index, invoices, that I need to aggregate into yearly buckets and then sort.

I have succeeded in using Bucket Sort to sort my buckets by simple sum values (revenue and tax). However, I am struggling to sort by more deeply nested doc_count values (status).

I want to order my buckets not only by revenue, but also by the number of docs with a status field equal to 1, 2, 3, etc.

The documents in the index look like this:

"_source": {
  "created_at": "2018-07-07T03:11:34.327Z",
  "status": 3,
  "revenue": 68.474,
  "tax": 6.85
}

I request the aggregation like this:

const params = {
  index: 'invoices',
  size: 0,
  body: {
    aggs: {
      sales: {
        date_histogram: {
          field: 'created_at',
          interval: 'year',
        },
        aggs: {
          total_revenue: { sum: { field: 'revenue' } },
          total_tax: { sum: { field: 'tax' } },
          statuses: {
            terms: {
              field: 'status',
            },
          },
          sales_bucket_sort: {
            bucket_sort: {
              sort: [{ total_revenue: { order: 'desc' } }],
            },
          },
        },
      },
    },
  },
}

The response (truncated) looks like this:

"aggregations": {
    "sales": {
        "buckets": [
            {
                "key_as_string": "2016-01-01T00:00:00.000Z",
                "key": 1451606400000,
                "doc_count": 254,
                "total_tax": {
                    "value": 735.53
                },
                "statuses": {
                    "sum_other_doc_count": 0,
                    "buckets": [
                        {
                            "key": 2,
                            "doc_count": 59
                        },
                        {
                            "key": 1,
                            "doc_count": 58
                        },
                        {
                            "key": 5,
                            "doc_count": 57
                        },
                        {
                            "key": 3,
                            "doc_count": 40
                        },
                        {
                            "key": 4,
                            "doc_count": 40
                        }
                    ]
                },
                "total_revenue": {
                    "value": 7355.376005351543
                }
            },
        ]
    }
}

For example, I want to sort by key: 1, ordering the yearly buckets according to which one has the greatest number of docs with a status value of 1. I tried ordering my terms aggregation and then specifying the desired key like this:

          statuses: {
            terms: {
              field: 'status',
              order: { _key: 'asc' },
            },
          },
          sales_bucket_sort: {
            bucket_sort: {
              sort: [{ 'statuses.buckets[0]._doc_count': { order: 'desc' } }],
            },
          },

However, this did not work. It didn't raise an error; it just didn't seem to have any effect.
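The ordering being asked for can also be stated client-side against the response shape shown above, which makes the goal concrete. This is only a minimal sketch and not a server-side fix: statusCount and sortByStatusCount are hypothetical helper names, and the 2017 and 2018 sample buckets are made-up data.

```javascript
// Return the doc_count of the terms bucket whose key equals statusKey,
// or 0 when that status does not occur in the yearly bucket.
function statusCount(yearBucket, statusKey) {
  const match = yearBucket.statuses.buckets.find((b) => b.key === statusKey);
  return match ? match.doc_count : 0;
}

// Sort a copy of the yearly buckets by the count of one status value.
function sortByStatusCount(yearBuckets, statusKey, order = 'desc') {
  const direction = order === 'asc' ? 1 : -1;
  return [...yearBuckets].sort(
    (a, b) => direction * (statusCount(a, statusKey) - statusCount(b, statusKey))
  );
}

// Sample data shaped like the truncated response. Only the 2016 counts
// come from the question; the other two years are hypothetical.
const sampleBuckets = [
  {
    key_as_string: '2016-01-01T00:00:00.000Z',
    statuses: { buckets: [{ key: 2, doc_count: 59 }, { key: 1, doc_count: 58 }] },
  },
  {
    key_as_string: '2017-01-01T00:00:00.000Z',
    statuses: { buckets: [{ key: 1, doc_count: 73 }] },
  },
  {
    key_as_string: '2018-01-01T00:00:00.000Z',
    statuses: { buckets: [{ key: 3, doc_count: 12 }] },
  },
];

const sortedBuckets = sortByStatusCount(sampleBuckets, 1);
console.log(sortedBuckets.map((b) => b.key_as_string));
```

This only reorders the buckets that were actually returned, so it is no substitute for server-side sorting when bucket_sort is also used to truncate the result with size or from.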

I noticed someone else on SO had a similar question many years ago, but I was hoping a better answer had emerged since then: "Elasticsearch aggregation. Order by nested bucket doc_count"

Thanks!

Recommended answer

Never mind, I figured it out. I added a separate filter aggregation like this:

        aggs: {
          total_revamnt: { sum: { field: 'revamnt' } },
          total_purchamnt: { sum: { field: 'purchamnt' } },
          approved_invoices: {
            filter: {
              term: {
                status: 1,
              },
            },
          },
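Since the question mentions status values 1, 2, 3 and so on, the same idea might be extended to one filter aggregation per status. The following is a sketch under that assumption: statusFilterAggs and the status_<n> aggregation names are hypothetical and do not come from the original answer.

```javascript
// Build one filter sub-aggregation per status value.
// The status_<n> names are hypothetical; any valid aggregation name works.
function statusFilterAggs(statusValues) {
  const aggs = {};
  for (const value of statusValues) {
    aggs[`status_${value}`] = { filter: { term: { status: value } } };
  }
  return aggs;
}

// These entries would be spread into the sub-aggregations of the yearly
// date_histogram, next to the sum aggregations.
const statusAggs = statusFilterAggs([1, 2, 3]);
console.log(JSON.stringify(statusAggs, null, 2));
```

Each generated name can then serve as a bucket_sort path of the form 'status_1>_count', following the same pattern as the approved_invoices aggregation.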

Then I was able to bucket sort on that value like this:

          sales_bucket_sort: {
            bucket_sort: {
              sort: [{ 'approved_invoices>_count': { order: 'asc' } }],
            },
          },
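Putting the two fragments together with the request from the question, the complete params object might look like the sketch below. It uses the field names from the question (revenue, tax) instead of the ones in the answer's fragment; buildSalesParams and matching_invoices are hypothetical names, and the default of 'desc' is an assumption based on the stated goal of putting the bucket with the most matching docs first.

```javascript
// Assemble the full search params: yearly histogram, two sums, a filter
// aggregation for one status value, and a bucket_sort on that filter's
// document count.
function buildSalesParams(statusValue, order = 'desc') {
  return {
    index: 'invoices',
    size: 0,
    body: {
      aggs: {
        sales: {
          // 'interval' matches the question; recent Elasticsearch
          // versions use 'calendar_interval' instead.
          date_histogram: { field: 'created_at', interval: 'year' },
          aggs: {
            total_revenue: { sum: { field: 'revenue' } },
            total_tax: { sum: { field: 'tax' } },
            // Single-bucket aggregation counting docs with this status.
            matching_invoices: { filter: { term: { status: statusValue } } },
            sales_bucket_sort: {
              bucket_sort: {
                sort: [{ 'matching_invoices>_count': { order } }],
              },
            },
          },
        },
      },
    },
  };
}

const sortParams = buildSalesParams(1);
console.log(JSON.stringify(sortParams.body.aggs.sales.aggs.sales_bucket_sort));
```

The resulting object has the same shape as the params in the question, so it should be usable wherever that object was passed.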
