Filter Elasticsearch results to contain only unique documents based on one field value

Problem description

All my documents have a uid field containing an ID that links the document to a user. Multiple documents can share the same uid.

I want to run a search over all the documents and return only the highest-scoring document for each unique uid.

The query that selects the relevant documents is a simple multi_match query.

Recommended answer

You need a top_hits aggregation.

For your specific case:

{
  "query": {
    "multi_match": {
      ...
    }
  },
  "aggs": {
    "top-uids": {
      "terms": {
        "field": "uid"
      },
      "aggs": {
        "top_uids_hits": {
          "top_hits": {
            "sort": [
              {
                "_score": {
                  "order": "desc"
                }
              }
            ],
            "size": 1
          }
        }
      }
    }
  }
}

The query above runs your multi_match query and groups the matching documents by uid with a terms aggregation. Within each uid bucket, the top_hits sub-aggregation sorts the documents by _score in descending order and returns only the first one, that is, the highest-scoring document for that uid.
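
As a rough illustration of how a client might use this, the sketch below builds the same request body and pulls the single best hit out of each uid bucket. It works on plain dictionaries, so no particular client library is assumed. The function names build_request and best_hit_per_uid, the max_uids parameter, and the top-level "size": 0 are assumptions added for the example and are not part of the original answer. The contents of multi_match are left to the caller, as in the query above.

```python
from typing import Any, Dict


def build_request(multi_match: Dict[str, Any], max_uids: int = 10) -> Dict[str, Any]:
    """Build the search body: the caller's multi_match query plus one top hit per uid."""
    return {
        # Assumption: suppress the regular hit list, since only the buckets are read.
        "size": 0,
        "query": {"multi_match": multi_match},
        "aggs": {
            "top-uids": {
                # Assumption: max_uids caps how many uid buckets come back.
                "terms": {"field": "uid", "size": max_uids},
                "aggs": {
                    "top_uids_hits": {
                        "top_hits": {
                            "sort": [{"_score": {"order": "desc"}}],
                            "size": 1,
                        }
                    }
                },
            }
        },
    }


def best_hit_per_uid(response: Dict[str, Any]) -> Dict[Any, Dict[str, Any]]:
    """Map each uid bucket key to its single highest-scoring hit."""
    best: Dict[Any, Dict[str, Any]] = {}
    for bucket in response["aggregations"]["top-uids"]["buckets"]:
        hits = bucket["top_uids_hits"]["hits"]["hits"]
        if hits:  # top_hits with size 1 yields at most one entry per bucket
            best[bucket["key"]] = hits[0]
    return best
```

Passing the parsed JSON response of the search to best_hit_per_uid gives one document per uid. Two things to keep in mind: a terms aggregation returns only 10 buckets by default, so its size needs raising if more distinct uid values are expected, and the buckets are ordered by document count rather than by score, so sort the result yourself if you need it ranked by relevance.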
