Elasticsearch exclude top hit on field value


Problem description

{'country': 'France', 'collected': '2018-03-12', 'active': true}
{'country': 'France', 'collected': '2018-03-13', 'active': true}
{'country': 'France', 'collected': '2018-03-14', 'active': false}
{'country': 'Canada', 'collected': '2018-02-01', 'active': false}
{'country': 'Canada', 'collected': '2018-02-02', 'active': true}

Let's say I have this result set and I want to group the documents by country, keeping only the most recently collected document for each country. After grouping, this is the result:

{'country': 'France', 'collected': '2018-03-14', 'active': false}
{'country': 'Canada', 'collected': '2018-02-02', 'active': true}

But I want to exclude results where active is false in the last row. The older rows of the same country can be true or false; that doesn't matter as long as the last row is true. How can I do that in Elasticsearch? Here is my query:

POST /test/_search?search_type=count
{
    "aggs": {
        "group": {
            "terms": {
                "field": "country"
            },
            "aggs": {
                "group_docs": {
                    "top_hits": {
                        "size": 1,
                        "sort": [
                            {
                                "collected": {
                                    "order": "desc"
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
}


Recommended answer

I think you can get away with sorting by two fields in your top_hits: first by active, then by collected. Basically, you want the active:true documents to come first and, among documents with the same active value, the most recently collected one to win. With something like the following, each bucket's top hit is its most recent active:true document whenever the country has one.

The only downside to this solution is that if a country doesn't have any active documents, top_hits will show one of its active:false documents instead.

{
  "size": 0,
  "aggs": {
    "group": {
      "terms": {
        "field": "country"
      },
      "aggs": {
        "group_docs": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "active": {
                  "order": "desc"
                }
              },
              {
                "collected": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}
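
To make the effect of the two sort orders concrete, here is a minimal pure-Python sketch that replays them on the sample documents from the question. It is only a simulation and does not talk to Elasticsearch; the helper name top_hit_per_country and the client-side filtering step are assumptions added for illustration, not part of the original answer.

```python
# Sample documents from the question, written as Python dicts.
docs = [
    {"country": "France", "collected": "2018-03-12", "active": True},
    {"country": "France", "collected": "2018-03-13", "active": True},
    {"country": "France", "collected": "2018-03-14", "active": False},
    {"country": "Canada", "collected": "2018-02-01", "active": False},
    {"country": "Canada", "collected": "2018-02-02", "active": True},
]


def top_hit_per_country(documents, sort_key):
    """Hypothetical helper: mimic terms + top_hits(size=1) with a descending sort."""
    top = {}
    for doc in sorted(documents, key=sort_key, reverse=True):
        # The first document seen for a country is its top hit.
        top.setdefault(doc["country"], doc)
    return top


# Sort from the answer: active first, then collected, both descending.
# ISO dates compare correctly as plain strings.
by_active_then_date = top_hit_per_country(
    docs, lambda d: (d["active"], d["collected"])
)

# Sort from the question: collected only. Dropping buckets whose top hit
# is inactive is done here on the client side (an assumption, not an
# Elasticsearch feature used in the thread).
latest = top_hit_per_country(docs, lambda d: d["collected"])
latest_active_only = {c: d for c, d in latest.items() if d["active"]}

print(by_active_then_date)
print(latest_active_only)
```

On this data the answer's sort returns France's 2018-03-13 document, because it prefers the newest active:true row, while filtering the question's original top hits keeps only Canada. Which of the two behaviours is wanted depends on whether a country whose latest row is inactive should be dropped entirely.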
