Elasticsearch: terms aggs according to sibling nested fields

Question

Elasticsearch v7.5

Hello and good day!

We have two indices named socialmedia and influencers.

Sample contents:

socialmedia:

{
    "_id" : 1001,
    "title" : "Title 1",
    "smp_id" : 1,
    "latest" : [
        {
          "soc_mm_score" : "5"
        }
    ]
},
{
    "_id" : 1002,
    "title" : "Title 2",
    "smp_id" : 2,
    "latest" : [
        {
          "soc_mm_score" : "10"
        }
    ]
},
{
    "_id" : 1003,
    "title" : "Title 3",
    "smp_id" : 3,
    "latest" : [
        {
          "soc_mm_score" : "35"
        }
    ]
},
{
    "_id" : 1004,
    "title" : "Title 4",
    "smp_id" : 2,
    "latest" : [
        {
          "soc_mm_score" : "30"
        }
    ]
}

//omitted some other fields

influencers:

{
    "_id" : 1,
    "name" : "John",
    "smp_id" : 1
},
{
    "_id" : 2,
    "name" : "Peter",
    "smp_id" : 2
},
{
    "_id" : 3,
    "name" : "Mark",
    "smp_id" : 3
}

Now I have this simple query that determines which documents in the socialmedia index have the highest latest.soc_mm_score values, and also displays their corresponding influencers, identified by smp_id:

GET socialmedia/_search
{
  "size": 0,
  "_source": "latest", 
  "query": {
    "match_all": {}
  }, 
  "aggs": {
    "LATEST": {
      "nested": {
        "path": "latest"
      },
      "aggs": {
        "MM_SCORE": {
          "terms": {
            "field": "latest.soc_mm_score",
            "order": {
              "_key": "desc"
            },
            "size": 3
          },
          "aggs": {
            "REVERSE": {
              "reverse_nested": {},
              "aggs": {
                "SMP_ID": {
                  "top_hits": {
                    "_source": ["smp_id"], 
                    "size": 1
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Sample output:

"aggregations" : {
    "LATEST" : {
      "doc_count" : //omitted,
      "MM_SCORE" : {
        "doc_count_error_upper_bound" : //omitted,
        "sum_other_doc_count" : //omitted,
        "buckets" : [
          {
            "key" : 35,
            "doc_count" : 1,
            "REVERSE" : {
              "doc_count" : 1,
              "SMP_ID" : {
                "hits" : {
                  "total" : {
                    "value" : 1,
                    "relation" : "eq"
                  },
                  "max_score" : 1.0,
                  "hits" : [
                    {
                      "_index" : "socialmedia",
                      "_type" : "index",
                      "_id" : "1003",
                      "_score" : 1.0,
                      "_source" : {
                        "smp_id" : "3"
                      }
                    }
                  ]
                }
              }
            }
          },
          {
            "key" : 30,
            "doc_count" : 1,
            "REVERSE" : {
              "doc_count" : 1,
              "SMP_ID" : {
                "hits" : {
                  "total" : {
                    "value" : 1,
                    "relation" : "eq"
                  },
                  "max_score" : 1.0,
                  "hits" : [
                    {
                      "_index" : "socialmedia",
                      "_type" : "index",
                      "_id" : "1004",
                      "_score" : 1.0,
                      "_source" : {
                        "smp_id" : "2"
                      }
                    }
                  ]
                }
              }
            }
          },
          {
            "key" : 10,
            "doc_count" : 1,
            "REVERSE" : {
              "doc_count" : 1,
              "SMP_ID" : {
                "hits" : {
                  "total" : {
                    "value" : 1,
                    "relation" : "eq"
                  },
                  "max_score" : 1.0,
                  "hits" : [
                    {
                      "_index" : "socialmedia",
                      "_type" : "index",
                      "_id" : "1002",
                      "_score" : 1.0,
                      "_source" : {
                        "smp_id" : "2"
                      }
                    }
                  ]
                }
              }
            }
          }
        ]
      }
    }
  }

With the query above, I was able to successfully display which documents have the highest latest.soc_mm_score values.

The sample output above only displays DOCUMENTS, implying that the influencers (a.k.a. smp_id) related to them are the TOP INFLUENCERS according to latest.soc_mm_score.

Ideally, just by using this aggs query,

"terms" : {
    "field" : "smp_id"
}

the TOP INFLUENCERS would be displayed according to doc_count.

Now, running the terms aggregation on latest.soc_mm_score displays the TOP DOCUMENTS:

"terms" : {
    "field" : "latest.soc_mm_score"
}

OBJECTIVE:

I want to display the TOP INFLUENCERS according to latest.soc_mm_score in the socialmedia index. If Elasticsearch can count all the documents for each unique smp_id, is there a way for ES to sum all latest.soc_mm_score values per smp_id and rank the terms by that sum?

My objective above should output these:

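  • smp_id 2 as the top influencer, because he has 2 posts (with soc_mm_score 30 and 10), which add up to a total soc_mm_score of 40
  • smp_id 3 as the 2nd top influencer, with 1 post and a soc_mm_score of 35
  • smp_id 1 as the 3rd top influencer, with 1 post and a soc_mm_score of 5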

Is there a proper query to meet this objective?
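
The ranking described in the objective can be checked by hand. The Python sketch below is only an illustration of the arithmetic, not an Elasticsearch call: it sums latest.soc_mm_score per smp_id over the four sample socialmedia documents and sorts by that sum. The function name rank_by_total_score is hypothetical, and the scores are converted with int() on the assumption that they are whole numbers stored as strings, as in the sample.

```python
from collections import defaultdict

# Sample socialmedia documents from the question (other fields omitted).
socialmedia = [
    {"_id": 1001, "smp_id": 1, "latest": [{"soc_mm_score": "5"}]},
    {"_id": 1002, "smp_id": 2, "latest": [{"soc_mm_score": "10"}]},
    {"_id": 1003, "smp_id": 3, "latest": [{"soc_mm_score": "35"}]},
    {"_id": 1004, "smp_id": 2, "latest": [{"soc_mm_score": "30"}]},
]


def rank_by_total_score(docs):
    """Sum latest.soc_mm_score per smp_id and sort by that sum, highest first."""
    totals = defaultdict(int)
    for doc in docs:
        for entry in doc["latest"]:
            # The sample stores scores as strings, so convert before adding.
            totals[doc["smp_id"]] += int(entry["soc_mm_score"])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = rank_by_total_score(socialmedia)
print(ranking)  # [(2, 40), (3, 35), (1, 5)]
```

This is the result the aggregation is expected to reproduce on the server side: one group per smp_id, ordered by the summed score rather than by doc_count.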

Recommended answer

FINALLY! FOUND AN ANSWER!!!

"aggs": {
    "INFS": {
      "terms": {
        "field": "smp_id.keyword",
        "order": {
          "LATEST > SUM_SVALUE": "desc"
        }
      },
      "aggs": {
        "LATEST": {
          "nested": {
            "path": "latest"
          },
          "aggs": {
            "SUM_SVALUE": {
              "sum" : {
                "field": "latest.soc_mm_score"
              }
            }
          }
        }
      }
    }
}

This returns one bucket per smp_id, ordered by the summed latest.soc_mm_score in descending order.
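
As a rough sketch of how the aggregation might be used from Python, the block below wraps the "aggs" fragment in a complete search body and reads the ranking back out of a parsed response. The names SEARCH_BODY and top_influencers are hypothetical, "size": 0 is an assumption carried over from the query in the question, and the response layout assumed here is the usual one for a terms aggregation with a nested sum sub-aggregation.

```python
# The "aggs" fragment from the answer, wrapped in a complete search body.
# "size": 0 is an assumption: it suppresses hits so only aggregations return.
SEARCH_BODY = {
    "size": 0,
    "aggs": {
        "INFS": {
            "terms": {
                "field": "smp_id.keyword",
                # Order buckets by the sum computed inside the nested sub-agg.
                "order": {"LATEST > SUM_SVALUE": "desc"},
            },
            "aggs": {
                "LATEST": {
                    "nested": {"path": "latest"},
                    "aggs": {
                        "SUM_SVALUE": {"sum": {"field": "latest.soc_mm_score"}}
                    },
                }
            },
        }
    },
}


def top_influencers(response):
    """Return (smp_id, summed score) pairs in the order the buckets arrived."""
    buckets = response["aggregations"]["INFS"]["buckets"]
    return [(b["key"], b["LATEST"]["SUM_SVALUE"]["value"]) for b in buckets]
```

With a 7.x version of the official Python client, the body could presumably be sent with something like es.search(index="socialmedia", body=SEARCH_BODY) and the result passed to top_influencers; the exact call depends on the client version in use.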
