ElasticSearch由多个字段组成 [英] ElasticSearch group by multiple fields
问题描述
我发现的唯一紧密的事情是: Elasticsearch中的多个group-by
基本上我试图获得与以下mysql查询相当的ES:
选择gender,age_range,count(distinct profile_id)as count FROM TABLE group by age_range,gender
自己的年龄和性别很容易得到:
{
query:{
match_all:{}
},
facets:{
ages:{
terms:{
:age_range,
size:20
}
},
gender_by_age:{
terms:{
fields :[
age_range,
gender
]
}
}
},
size:0
}
其中:
{
但是现在我需要的东西看起来像这样:
年龄: {
_type:terms,
missing:0,
total:193961,
other:0,
terms :[
{
term:0,
count:162643
},
{
term:3,
count:10683
},
{
term:4,
count:8931
},
{
term:5,
count:4690
},
{
term:6,
count:3647
},
{
term:2,
count:3247
},
{
term:1,
count:120
}
]
},
total_gender:{
_type:terms,
missing :0,
total:193961,
other:0,
terms:[
{
term:1,
count:94799
},
{
term:2,
count:62645
},
{
term:0,
count:36517
}
]
}
}
[breakdown_gender] => Array
(
[1] => Array
(
[0] => 264
[1] => 1
[2 ] => 6
[3] => 67
[4] => 72
[5] => 40
[6] =& b $ b)
[2] => Array
(
[0] => 153
[2] => 2
[3] => 21
[4] => 35
[5] => 22
[6] => 11
)
)
请注意,
0,1,2,3, 4,5,6
是年龄范围的映射,所以他们实际上意味着什么:)而不只是数字。例如性别[1](这是男性)分解为年龄段[0](即18岁以下),计数为246。解决方案由于您只有两个字段,一个简单的方法是使用单个方面进行两个查询。对于男性:
{
查询:{
term:{gender $
facets:{
age_range:{
terms:{
field:age_range
}
}
}
}
而对于女性:
{
查询:{
term:{gender
},
facets:{
age_range:{
terms:{
field:age_range
}
}
}
}
或者你可以做到在具有构面过滤器的单个查询中(有关详细信息,请参阅此链接)
{
查询:{
match_all:{}
},
facet:{
age_range_male:{
terms: {
field:age_range
},
facet_filter:{
term:{
gender:Male
}
},
age_range_female:{
terms:{
field:age_range
},
facet_filter:{
term:{
gender:女
}
}
}
}
}
更新:
即将被删除。这是具有汇总的解决方案:
{
query:{
match_all:{ }
},
aggs:{
male:{
filter:{
term:{
gender :$
aggs:{
age_range:{
terms:{
field $
女:{
过滤器:{
term: {
gender:女
}
},
aggs:{
age_range:{
terms
field:age_range
}
}
}
}
}
}
The only close thing that I've found was: Multiple group-by in Elasticsearch
Basically I'm trying to get the ES equivalent of the following mysql query:
select gender, age_range, count(distinct profile_id) as count FROM TABLE group by age_range, gender
The age and gender by themselves were easy to get:
{ "query": { "match_all": {} }, "facets": { "ages": { "terms": { "field": "age_range", "size": 20 } }, "gender_by_age": { "terms": { "fields": [ "age_range", "gender" ] } } }, "size": 0 }
which gives:
{ "ages": { "_type": "terms", "missing": 0, "total": 193961, "other": 0, "terms": [ { "term": 0, "count": 162643 }, { "term": 3, "count": 10683 }, { "term": 4, "count": 8931 }, { "term": 5, "count": 4690 }, { "term": 6, "count": 3647 }, { "term": 2, "count": 3247 }, { "term": 1, "count": 120 } ] }, "total_gender": { "_type": "terms", "missing": 0, "total": 193961, "other": 0, "terms": [ { "term": 1, "count": 94799 }, { "term": 2, "count": 62645 }, { "term": 0, "count": 36517 } ] } }
But now I need something that looks like this:
[breakdown_gender] => Array ( [1] => Array ( [0] => 264 [1] => 1 [2] => 6 [3] => 67 [4] => 72 [5] => 40 [6] => 23 ) [2] => Array ( [0] => 153 [2] => 2 [3] => 21 [4] => 35 [5] => 22 [6] => 11 ) )
Please note that
0,1,2,3,4,5,6
are "mappings" for the age ranges so they actually mean something :) and not just numbers. e.g. Gender[1] (which is "male") breaks down into age range [0] (which is "under 18") with a count of 246.解决方案As you only have 2 fields a simple way is doing two queries with single facets. For Male:
{ "query" : { "term" : { "gender" : "Male" } }, "facets" : { "age_range" : { "terms" : { "field" : "age_range" } } } }
And for female:
{ "query" : { "term" : { "gender" : "Female" } }, "facets" : { "age_range" : { "terms" : { "field" : "age_range" } } } }
Or you can do it in a single query with a facet filter (see this link for further information)
{ "query" : { "match_all": {} }, "facets" : { "age_range_male" : { "terms" : { "field" : "age_range" }, "facet_filter":{ "term": { "gender": "Male" } } }, "age_range_female" : { "terms" : { "field" : "age_range" }, "facet_filter":{ "term": { "gender": "Female" } } } } }
Update:
As facets are about to be removed. This is the solution with aggregations:
{ "query": { "match_all": {} }, "aggs": { "male": { "filter": { "term": { "gender": "Male" } }, "aggs": { "age_range": { "terms": { "field": "age_range" } } } }, "female": { "filter": { "term": { "gender": "Female" } }, "aggs": { "age_range": { "terms": { "field": "age_range" } } } } } }
这篇关于ElasticSearch由多个字段组成的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!