How to perform nested aggregation on multiple fields in Solr?
Question
I am trying to perform search result aggregation (count and sum) grouping by several fields in a nested fashion.
For example, with the schema shown at the end of this post, I'd like to be able to get the sum of "size" grouped by "category" and sub-grouped further by "subcategory" and get something like this:
<category name="X">
  <subcategory name="X_A">
    <size sum="..." />
  </subcategory>
  <subcategory name="X_B">
    <size sum="..." />
  </subcategory>
</category>
....
I've been looking primarily at Solr's Stats component, which, as far as I can see, doesn't allow nested aggregation.
I'd appreciate it if anyone knows of some way to implement this, with or without the Stats component.
Here is a cut-down version of the target schema:
<types>
  <fieldType name="string" class="solr.StrField" />
  <fieldType name="text" class="solr.TextField">
    <analyzer><tokenizer class="solr.StandardTokenizerFactory" /></analyzer>
  </fieldType>
  <fieldType name="date" class="solr.DateField" />
  <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0" />
</types>
<fields>
  <field name="id" type="string" indexed="true" stored="true" />
  <field name="category" type="text" indexed="true" stored="true" />
  <field name="subcategory" type="text" indexed="true" stored="true" />
  <field name="pdate" type="date" indexed="true" stored="true" />
  <field name="size" type="int" indexed="true" stored="true" />
</fields>
Recommended answer
The new faceting module in Solr 5.1 can do this; it was added in https://issues.apache.org/jira/browse/SOLR-7214
Here is how you would add sum(size) to every facet bucket, and sort descending by that statistic.
json.facet={
  categories:{terms:{
    field:category,
    sort:"total_size desc", // this will sort the facet buckets by your stat
    facet:{
      total_size:"sum(size)" // this calculates the stat per bucket
    }
  }}
}
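As a rough sketch of how that facet could be sent from a client, the Python below builds the request parameters with the standard library only. The host, port, and collection name in SOLR_SELECT_URL are placeholders rather than anything from the question, and build_facet_params is a hypothetical helper name. The facet is written as strict JSON (quoted keys, no comments), which should be equivalent to the relaxed syntax above.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and collection name are placeholders.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycollection/select"


def build_facet_params(query="*:*"):
    """Return request parameters for the per-category sum(size) facet."""
    facet = {
        "categories": {
            "terms": {
                "field": "category",
                "sort": "total_size desc",  # order buckets by the stat below
                "facet": {"total_size": "sum(size)"},  # stat per bucket
            }
        }
    }
    return {
        "q": query,
        "rows": 0,  # only the facet results are needed, not the documents
        "wt": "json",
        "json.facet": json.dumps(facet),
    }


params = build_facet_params()
request_url = SOLR_SELECT_URL + "?" + urlencode(params)
print(request_url)
```

The printed URL can be fetched with any HTTP client; rows=0 is used here on the assumption that only the aggregates matter.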
And this is how you would add in the subfacet on subcategory:
json.facet={
  categories:{terms:{
    field:category,
    sort:"total_size desc",
    facet:{
      total_size:"sum(size)",
      subcat:{terms:{ // this will facet on the subcategory field for each bucket
        field:subcategory,
        facet:{
          sz:"sum(size)" // this calculates the sum per sub-cat bucket
        }
      }}
    }
  }}
}
So the above will give you the sum(size) at both the category and subcategory levels. Documentation for the new facet module is currently at http://yonik.com/json-facet-api/
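To get from the facet response to the category/subcategory layout asked for in the question, something like the following Python sketch may help. It assumes the response nests each facet under a "facets" key as a list of "buckets" with "val" and "count" entries, with the stats appearing under the names chosen in the request (total_size, subcat, sz). The sample response is hand-written for illustration, not real Solr output, and nested_sums is a hypothetical helper name.

```python
# Illustrative response fragment only: the values are made up, and the layout
# is an assumption based on the facet names used in the request.
sample_response = {
    "facets": {
        "count": 5,
        "categories": {
            "buckets": [
                {"val": "X", "count": 3, "total_size": 60.0,
                 "subcat": {"buckets": [
                     {"val": "X_A", "count": 2, "sz": 45.0},
                     {"val": "X_B", "count": 1, "sz": 15.0},
                 ]}},
                {"val": "Y", "count": 2, "total_size": 20.0,
                 "subcat": {"buckets": [
                     {"val": "Y_A", "count": 2, "sz": 20.0},
                 ]}},
            ]
        },
    }
}


def nested_sums(response):
    """Map each category to its total size and per-subcategory sums."""
    result = {}
    categories = response.get("facets", {}).get("categories", {})
    for bucket in categories.get("buckets", []):
        subcats = bucket.get("subcat", {}).get("buckets", [])
        result[bucket["val"]] = {
            "total_size": bucket.get("total_size"),
            "subcategories": {sub["val"]: sub.get("sz") for sub in subcats},
        }
    return result


sums = nested_sums(sample_response)
for category, info in sums.items():
    print(category, info["total_size"], info["subcategories"])
```

Using .get with empty defaults keeps the helper from failing when a category bucket has no subcategory buckets or the query matches nothing.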