How to return all documents for each bucket after an Elasticsearch terms aggregation?

Question

I use the following simple query to search across documents in my Elasticsearch index:

{
    "query": { "query_string": { "query": "*test*" } },
    "aggregations": {
        "myaggregation": {
            "terms": { "field": "myField.raw", "size": 0 }
        }
    }
}

This returns the number of documents for each distinct value of myField.raw.

Since I'm interested in the actual documents rather than just the counts, I tried adding the following top_hits sub-aggregation:

{
    "query": { "query_string": { "query": "*test*" } },
    "aggregations": {
        "myaggregation": {
            "terms": { "field": "myField.raw", "size": 0 },
            "aggregations": {
                "hits": {
                    "top_hits": { "size": 2000000 }
                }
            }
        }
    }
}

This ugly use of top_hits works, but it is slow as hell.

Is there a proper way to fetch the actual documents for each bucket after doing the terms aggregation?

Recommended answer

Have you considered using collapse on the field?

It returns the documents grouped under inner_hits (hits.hits[].inner_hits.<collapse-group-name>.hits.hits[]._source).

Reference: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/search-request-collapse.html
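
To make the suggestion concrete, here is a minimal sketch in Python of what such a request body could look like and how the grouped documents might be read back from the response path given above. The helper names (build_collapse_query, docs_by_group), the inner_hits group name "docs", and the size values are assumptions chosen for illustration; only the query string and the field myField.raw come from the question. The sketch only builds and parses dictionaries, so sending the body to a cluster is left to whichever client you use.

```python
from typing import Any, Dict, List


def build_collapse_query(field: str, group_name: str,
                         per_group: int, groups: int) -> Dict[str, Any]:
    """Build a search body that collapses hits on `field` and expands
    each collapsed group through a named inner_hits section."""
    return {
        "query": {"query_string": {"query": "*test*"}},
        # Top-level size: how many collapsed groups come back.
        "size": groups,
        "collapse": {
            "field": field,
            # inner_hits size: how many documents are returned per group.
            "inner_hits": {"name": group_name, "size": per_group},
        },
    }


def docs_by_group(response: Dict[str, Any], field: str,
                  group_name: str) -> Dict[Any, List[Dict[str, Any]]]:
    """Map each collapse key to the _source of its inner hits."""
    grouped: Dict[Any, List[Dict[str, Any]]] = {}
    for hit in response["hits"]["hits"]:
        # The collapse key is reported under "fields" as a one-element list.
        key = hit["fields"][field][0]
        inner = hit["inner_hits"][group_name]["hits"]["hits"]
        grouped[key] = [doc["_source"] for doc in inner]
    return grouped


if __name__ == "__main__":
    body = build_collapse_query("myField.raw", "docs", per_group=5, groups=10)
    print(body)

    # Hand-written response fragment (hypothetical data, trimmed to the
    # keys the helper reads) to show the shape being parsed.
    sample = {"hits": {"hits": [{
        "fields": {"myField.raw": ["a"]},
        "inner_hits": {"docs": {"hits": {"hits": [
            {"_source": {"title": "test one"}},
            {"_source": {"title": "test two"}},
        ]}}},
    }]}}
    print(docs_by_group(sample, "myField.raw", "docs"))
```

Note that collapse limits both the number of groups (the top-level size) and the number of documents per group (the inner_hits size, which as far as I know is capped by the index setting index.max_inner_result_window, 100 by default), so it may not literally return every document for very large buckets.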
