Select distinct values from a bool query in Elasticsearch


Question

I have a query that gets some user post data from an Elasticsearch index. I am happy with the query, but I need it to return rows with unique usernames. Currently it displays relevant posts by users, but the same user may appear twice.

{
          "query": {
            "bool": {
              "should": [
                          { "match_phrase": { "gtitle": {"query": "voice","boost": 1}}},
                          { "match_phrase": { "gdesc": {"query": "voice","boost": 1}}},
                          { "match": { "city": {"query": "voice","boost": 2}}},
                          { "match": { "gtags": {"query": "voice","boost": 1}   }}
              ],"must_not": [
                          { "term": { "profilepicture": ""}}
              ],"minimum_should_match" : 1
            }
          }
}

I have read about aggregations but didn't understand much (I also tried to use aggs, but that didn't work either). Any help is appreciated.

Recommended answer

You need a terms aggregation to get all unique users, with a top_hits sub-aggregation to get only one result for each user. This is how it looks:

{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "gtitle": {
              "query": "voice",
              "boost": 1
            }
          }
        },
        {
          "match_phrase": {
            "gdesc": {
              "query": "voice",
              "boost": 1
            }
          }
        },
        {
          "match": {
            "city": {
              "query": "voice",
              "boost": 2
            }
          }
        },
        {
          "match": {
            "gtags": {
              "query": "voice",
              "boost": 1
            }
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "profilepicture": ""
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "aggs": {
    "unique_user": {
      "terms": {
        "field": "userid",
        "size": 100
      },
      "aggs": {
        "only_one_post": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  },
  "size": 0
}
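With the outer size set to 0, the posts no longer arrive in the top-level hits; they sit inside each bucket of the aggregation. As a minimal sketch of how a client might flatten that response into one row per user, assuming the aggregation names unique_user and only_one_post from the query above (the helper name one_post_per_user and the sample response are hypothetical, written by hand for illustration):

```python
from typing import Any, Dict, List


def one_post_per_user(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten the unique_user / only_one_post aggregation into one row per user."""
    rows = []
    buckets = (
        response.get("aggregations", {})
        .get("unique_user", {})
        .get("buckets", [])
    )
    for bucket in buckets:
        # top_hits nests its results under hits.hits, like a normal search.
        hits = bucket["only_one_post"]["hits"]["hits"]
        if not hits:
            continue
        top = hits[0]
        rows.append(
            {
                "userid": bucket["key"],                # the unique term
                "matching_posts": bucket["doc_count"],  # posts matched for this user
                "score": top.get("_score"),
                "post": top.get("_source", {}),
            }
        )
    return rows


# Hand-written sample, trimmed to the fields used above (not real data).
sample_response = {
    "hits": {"total": 3, "hits": []},
    "aggregations": {
        "unique_user": {
            "buckets": [
                {
                    "key": "u1",
                    "doc_count": 2,
                    "only_one_post": {
                        "hits": {
                            "hits": [
                                {
                                    "_id": "p1",
                                    "_score": 1.7,
                                    "_source": {"userid": "u1", "gtitle": "voice lessons"},
                                }
                            ]
                        }
                    },
                },
                {
                    "key": "u2",
                    "doc_count": 1,
                    "only_one_post": {
                        "hits": {
                            "hits": [
                                {
                                    "_id": "p7",
                                    "_score": 0.9,
                                    "_source": {"userid": "u2", "gtitle": "voice over"},
                                }
                            ]
                        }
                    },
                },
            ]
        }
    },
}

for row in one_post_per_user(sample_response):
    print(row["userid"], row["matching_posts"], row["post"]["gtitle"])
```

Note that a terms aggregation orders its buckets by document count by default, so the rows come back ordered by how many posts each user matched rather than by relevance; within each bucket, top_hits picks the highest-scoring post unless a sort is specified.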

Here the size inside the unique_user aggregation is 100; increase it if you have more unique users than that (the default is 10). The outermost size is 0 so that only aggregation results are returned. One important thing to remember is that the terms aggregation groups on the indexed value of userid, so each user ID has to be indexed as a single exact term: ABC and abc will be considered different users. You might have to map userid as not_analyzed to be sure of that (in Elasticsearch 5.x and later, the equivalent is mapping the field as keyword).
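To make the mapping advice concrete, here is a small sketch of the field definition for userid. It assumes the older string mapping syntax (Elasticsearch 2.x and earlier) that not_analyzed belongs to, with the keyword type as the later equivalent; the helper name exact_value_field is hypothetical, and the surrounding mapping body (index name, mapping type) depends on your setup and is not shown.

```python
import json


def exact_value_field(legacy: bool = True) -> dict:
    """Return a field definition that indexes the value as one exact term.

    legacy=True  -> string + not_analyzed (Elasticsearch 2.x and earlier)
    legacy=False -> keyword (Elasticsearch 5.x and later)
    """
    if legacy:
        return {"type": "string", "index": "not_analyzed"}
    return {"type": "keyword"}


# Only the "properties" fragment; wrap it in the mapping body your version expects.
properties = {"properties": {"userid": exact_value_field()}}
print(json.dumps(properties, indent=2))
```

Changing how an existing field is indexed generally requires reindexing the data, so it is worth checking the current mapping of userid before relying on the aggregation.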

Hope this helps!
