ElasticSearch group by multiple fields

Problem description

The only close thing that I've found was: Multiple group-by in Elasticsearch

Basically I'm trying to get the ES equivalent of the following MySQL query:

SELECT gender, age_range, COUNT(DISTINCT profile_id) AS count FROM TABLE GROUP BY age_range, gender
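
To make the target concrete, here is a small Python sketch of what that GROUP BY computes, run over a few in-memory rows. The sample rows and the function name `count_distinct_profiles` are made up for illustration; they are not data from the index.

```python
from collections import defaultdict


def count_distinct_profiles(rows):
    """Return {(age_range, gender): number of distinct profile_id values}."""
    groups = defaultdict(set)
    for row in rows:
        # A set per group keeps each profile_id only once.
        groups[(row["age_range"], row["gender"])].add(row["profile_id"])
    return {key: len(ids) for key, ids in groups.items()}


# Made-up sample rows, only to illustrate the grouping.
sample_rows = [
    {"profile_id": 1, "gender": 1, "age_range": 0},
    {"profile_id": 1, "gender": 1, "age_range": 0},  # same profile again, counted once
    {"profile_id": 2, "gender": 1, "age_range": 3},
    {"profile_id": 3, "gender": 2, "age_range": 0},
]
counts = count_distinct_profiles(sample_rows)
```

Each key of `counts` is one (age_range, gender) pair, which is the shape the Elasticsearch result needs to be folded into.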

The age and gender by themselves were easy to get:

{
  "query": {
    "match_all": {}
  },
  "facets": {
    "ages": {
      "terms": {
        "field": "age_range",
        "size": 20
      }
    },
    "gender_by_age": {
      "terms": {
        "fields": [
          "age_range",
          "gender"
        ]
      }
    }
  },
  "size": 0
}

which gives:

{
  "ages": {
    "_type": "terms",
    "missing": 0,
    "total": 193961,
    "other": 0,
    "terms": [
      {
        "term": 0,
        "count": 162643
      },
      {
        "term": 3,
        "count": 10683
      },
      {
        "term": 4,
        "count": 8931
      },
      {
        "term": 5,
        "count": 4690
      },
      {
        "term": 6,
        "count": 3647
      },
      {
        "term": 2,
        "count": 3247
      },
      {
        "term": 1,
        "count": 120
      }
    ]
  },
  "total_gender": {
    "_type": "terms",
    "missing": 0,
    "total": 193961,
    "other": 0,
    "terms": [
      {
        "term": 1,
        "count": 94799
      },
      {
        "term": 2,
        "count": 62645
      },
      {
        "term": 0,
        "count": 36517
      }
    ]
  }
}

But now I need something that looks like this:

[breakdown_gender] => Array
    (
        [1] => Array
            (
                [0] => 264
                [1] => 1
                [2] => 6
                [3] => 67
                [4] => 72
                [5] => 40
                [6] => 23
            )

        [2] => Array
            (
                [0] => 153
                [2] => 2
                [3] => 21
                [4] => 35
                [5] => 22
                [6] => 11
            )

    )

Please note that 0,1,2,3,4,5,6 are "mappings" for the age ranges, so they actually mean something :) and are not just numbers. For example, Gender[1] (which is "male") breaks down into age range [0] (which is "under 18") with a count of 264.

Solution

Since you only have two fields, a simple approach is to run two queries, each with a single facet. For male:

{
    "query" : {
      "term" : { "gender" : "Male" }
    },
    "facets" : {
        "age_range" : {
            "terms" : {
                "field" : "age_range"
            }
        }
    }
}

And for female:

{
    "query" : {
      "term" : { "gender" : "Female" }
    },
    "facets" : {
        "age_range" : {
            "terms" : {
                "field" : "age_range"
            }
        }
    }
}
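
With two separate responses, the per-gender counts still have to be combined on the client side. A minimal sketch of that step, assuming each response exposes its `age_range` facet in the `terms` / `term` / `count` shape shown in the question; the helper names and the hand-written facet dictionaries are illustrative stand-ins, not real responses:

```python
def facet_terms_to_counts(facet):
    """Turn one terms facet ({"terms": [{"term": t, "count": c}, ...]}) into {t: c}."""
    return {entry["term"]: entry["count"] for entry in facet["terms"]}


def build_breakdown(facets_by_gender):
    """Map each gender key to its {age_range: count} dictionary."""
    return {
        gender: facet_terms_to_counts(facet)
        for gender, facet in facets_by_gender.items()
    }


# Hand-written stand-ins for the "age_range" facet of each response.
male_facet = {
    "_type": "terms",
    "terms": [{"term": 0, "count": 264}, {"term": 1, "count": 1}],
}
female_facet = {
    "_type": "terms",
    "terms": [{"term": 0, "count": 153}, {"term": 2, "count": 2}],
}

# Keys 1 and 2 follow the question's gender mapping (1 = male, 2 = female).
breakdown_gender = build_breakdown({1: male_facet, 2: female_facet})
```

Age ranges with no matching documents are simply absent from the inner dictionary, which matches the missing `[1]` entry in the desired output for gender 2.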

Or you can do it in a single query by attaching a facet filter to each facet:

{
    "query" : {
       "match_all": {}
    },
    "facets" : {
        "age_range_male" : {
            "terms" : {
                "field" : "age_range"
            },
            "facet_filter":{
                "term": {
                    "gender": "Male"
                }
            }
        },
        "age_range_female" : {
            "terms" : {
                "field" : "age_range"
            },
            "facet_filter":{
                "term": {
                    "gender": "Female"
                }
            }
        }
    }
}

Update:

As facets are about to be removed, here is the same solution using aggregations:

{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "male": {
      "filter": {
        "term": {
          "gender": "Male"
        }
      },
      "aggs": {
        "age_range": {
          "terms": {
            "field": "age_range"
          }
        }
      }
    },
    "female": {
      "filter": {
        "term": {
          "gender": "Female"
        }
      },
      "aggs": {
        "age_range": {
          "terms": {
            "field": "age_range"
          }
        }
      }
    }
  }
}
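
If there are more than two values to split on, writing each filter block by hand gets repetitive. The following Python sketch builds the same request body from a mapping and flattens the response. It assumes the usual filter/terms aggregation response layout (`buckets` holding `key` and `doc_count`); the helper names and the sample response are hypothetical, and no Elasticsearch client is called here.

```python
def build_filter_aggs(field, values, sub_field):
    """One filter aggregation per entry of `values`, each with a terms
    sub-aggregation on `sub_field`."""
    return {
        name: {
            "filter": {"term": {field: value}},
            "aggs": {sub_field: {"terms": {"field": sub_field}}},
        }
        for name, value in values.items()
    }


def parse_filter_aggs(aggregations, sub_field):
    """Flatten the "aggregations" part of a response into {name: {key: doc_count}}."""
    return {
        name: {b["key"]: b["doc_count"] for b in agg[sub_field]["buckets"]}
        for name, agg in aggregations.items()
    }


# Request body equivalent to the JSON above; "size": 0 is added to skip the hits.
body = {
    "query": {"match_all": {}},
    "size": 0,
    "aggs": build_filter_aggs(
        "gender", {"male": "Male", "female": "Female"}, "age_range"
    ),
}

# Hand-written stand-in for response["aggregations"], for illustration only.
sample_aggregations = {
    "male": {
        "doc_count": 265,
        "age_range": {"buckets": [{"key": 0, "doc_count": 264},
                                  {"key": 1, "doc_count": 1}]},
    },
    "female": {
        "doc_count": 155,
        "age_range": {"buckets": [{"key": 0, "doc_count": 153},
                                  {"key": 2, "doc_count": 2}]},
    },
}
breakdown = parse_filter_aggs(sample_aggregations, "age_range")
```

Note that `doc_count` counts documents, not distinct `profile_id` values. If several documents can share a profile, a `cardinality` sub-aggregation on `profile_id` inside each age bucket is probably closer to `COUNT(DISTINCT ...)`, keeping in mind that its result is approximate.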
