ElasticSearch group by multiple fields

Question

The closest thing I've found is: Multiple group-by in Elasticsearch

Basically, I'm trying to get the ES equivalent of the following MySQL query:

SELECT gender, age_range, COUNT(DISTINCT profile_id) AS count
FROM TABLE GROUP BY age_range, gender

The age and gender counts by themselves were easy to get:

{
  "query": {
    "match_all": {}
  },
  "facets": {
    "ages": {
      "terms": {
        "field": "age_range",
        "size": 20
      }
    },
    "gender_by_age": {
      "terms": {
        "fields": [
          "age_range",
          "gender"
        ]
      }
    }
  },
  "size": 0
}

Which gives:

{
  "ages": {
    "_type": "terms",
    "missing": 0,
    "total": 193961,
    "other": 0,
    "terms": [
      {
        "term": 0,
        "count": 162643
      },
      {
        "term": 3,
        "count": 10683
      },
      {
        "term": 4,
        "count": 8931
      },
      {
        "term": 5,
        "count": 4690
      },
      {
        "term": 6,
        "count": 3647
      },
      {
        "term": 2,
        "count": 3247
      },
      {
        "term": 1,
        "count": 120
      }
    ]
  },
  "total_gender": {
    "_type": "terms",
    "missing": 0,
    "total": 193961,
    "other": 0,
    "terms": [
      {
        "term": 1,
        "count": 94799
      },
      {
        "term": 2,
        "count": 62645
      },
      {
        "term": 0,
        "count": 36517
      }
    ]
  }
}
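For reference, a terms facet result like the ones above can be flattened into a plain term-to-count mapping before any further processing. This is a minimal Python sketch, not part of the original post; the helper name `facet_terms_to_counts` is made up for illustration, and the input is the `total_gender` facet copied from the response above.

```python
def facet_terms_to_counts(facet):
    """Flatten a terms facet result into a {term: count} dict."""
    return {entry["term"]: entry["count"] for entry in facet["terms"]}


# The "total_gender" facet, copied from the response shown above.
total_gender = {
    "_type": "terms",
    "missing": 0,
    "total": 193961,
    "other": 0,
    "terms": [
        {"term": 1, "count": 94799},
        {"term": 2, "count": 62645},
        {"term": 0, "count": 36517},
    ],
}

gender_counts = facet_terms_to_counts(total_gender)
print(gender_counts)
```

The per-term counts add up to the facet's `total`, which is a quick sanity check that no terms were cut off by the facet's `size` limit.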

But now I need something that looks like this:

[breakdown_gender] => Array
    (
        [1] => Array
            (
                [0] => 264
                [1] => 1
                [2] => 6
                [3] => 67
                [4] => 72
                [5] => 40
                [6] => 23
            )

        [2] => Array
            (
                [0] => 153
                [2] => 2
                [3] => 21
                [4] => 35
                [5] => 22
                [6] => 11
            )

    )

Please note that 0,1,2,3,4,5,6 are "mappings" for the age ranges, so they actually mean something :) and are not just numbers. For example, Gender[1] (which is "male") breaks down into age range [0] (which is "under 18") with a count of 264.

Answer

As you only have 2 fields, a simple way is to run two queries, each with a single facet. For Male:

{
    "query" : {
      "term" : { "gender" : "Male" }
    },
    "facets" : {
        "age_range" : {
            "terms" : {
                "field" : "age_range"
            }
        }
    }
}

For Female:

{
    "query" : {
      "term" : { "gender" : "Female" }
    },
    "facets" : {
        "age_range" : {
            "terms" : {
                "field" : "age_range"
            }
        }
    }
}

Or you can do it in a single query with a facet filter (see this link for further information):

{
    "query" : {
       "match_all": {}
    },
    "facets" : {
        "age_range_male" : {
            "terms" : {
                "field" : "age_range"
            },
            "facet_filter":{
                "term": {
                    "gender": "Male"
                }
            }
        },
        "age_range_female" : {
            "terms" : {
                "field" : "age_range"
            },
            "facet_filter":{
                "term": {
                    "gender": "Female"
                }
            }
        }
    }
}
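If there are more than two values to split on, writing one filtered facet per value by hand gets repetitive. The following Python sketch generates the same request body programmatically. The function name `build_facet_filter_query` and the facet naming scheme are assumptions for illustration, and the facets API is only available on Elasticsearch versions before 2.0.

```python
import json


def build_facet_filter_query(group_field, group_values, facet_field):
    """Build one filtered terms facet per value of group_field."""
    facets = {}
    for value in group_values:
        # Facet name such as "age_range_male" (naming scheme is illustrative).
        name = f"{facet_field}_{str(value).lower()}"
        facets[name] = {
            "terms": {"field": facet_field},
            "facet_filter": {"term": {group_field: value}},
        }
    return {"query": {"match_all": {}}, "facets": facets}


body = build_facet_filter_query("gender", ["Male", "Female"], "age_range")
print(json.dumps(body, indent=2))
```

The resulting `body` has the same structure as the single-query example above and can be sent as the search request body.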

Update:

As facets are about to be removed, here is the solution with aggregations:

{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "male": {
      "filter": {
        "term": {
          "gender": "Male"
        }
      },
      "aggs": {
        "age_range": {
          "terms": {
            "field": "age_range"
          }
        }
      }
    },
    "female": {
      "filter": {
        "term": {
          "gender": "Female"
        }
      },
      "aggs": {
        "age_range": {
          "terms": {
            "field": "age_range"
          }
        }
      }
    }
  }
}
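An alternative that avoids listing each gender value is to nest one `terms` aggregation inside another and add a `cardinality` sub-aggregation for the `count(distinct profile_id)` part. The Python sketch below is not from the original answer: the aggregation names (`by_outer`, `by_inner`, `distinct`) and both helper functions are made up for illustration, and the sample response is hand-written in the shape Elasticsearch returns for nested bucket aggregations, reusing a few counts from the desired output in the question. Note that `cardinality` returns an approximate distinct count.

```python
def build_group_by_query(outer_field, inner_field, distinct_field):
    """Nested terms aggregations with a distinct count per inner bucket."""
    return {
        "size": 0,
        "query": {"match_all": {}},
        "aggs": {
            "by_outer": {
                "terms": {"field": outer_field},
                "aggs": {
                    "by_inner": {
                        "terms": {"field": inner_field},
                        "aggs": {
                            "distinct": {"cardinality": {"field": distinct_field}}
                        },
                    }
                },
            }
        },
    }


def to_breakdown(response):
    """Reshape the nested buckets into {outer_key: {inner_key: count}}."""
    breakdown = {}
    for outer in response["aggregations"]["by_outer"]["buckets"]:
        breakdown[outer["key"]] = {
            inner["key"]: inner["distinct"]["value"]
            for inner in outer["by_inner"]["buckets"]
        }
    return breakdown


query = build_group_by_query("gender", "age_range", "profile_id")

# Hand-written sample response (hypothetical doc_count values).
sample_response = {
    "aggregations": {
        "by_outer": {
            "buckets": [
                {"key": 1, "doc_count": 300, "by_inner": {"buckets": [
                    {"key": 0, "doc_count": 280, "distinct": {"value": 264}},
                    {"key": 1, "doc_count": 1, "distinct": {"value": 1}},
                ]}},
                {"key": 2, "doc_count": 170, "by_inner": {"buckets": [
                    {"key": 0, "doc_count": 160, "distinct": {"value": 153}},
                    {"key": 2, "doc_count": 2, "distinct": {"value": 2}},
                ]}},
            ]
        }
    }
}

breakdown = to_breakdown(sample_response)
print(breakdown)
```

The `terms` aggregation returns only the top 10 buckets by default, so a `size` parameter may be needed if either field has more distinct values than that.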
