How to perform nested aggregation on multiple fields in Solr?

Question

I am trying to perform search result aggregation (count and sum) grouping by several fields in a nested fashion.

For example, with the schema shown at the end of this post, I'd like to be able to get the sum of "size" grouped by "category" and sub-grouped further by "subcategory" and get something like this:

<category name="X">
  <subcategory name="X_A">
    <size sum="..." />
  </subcategory>
  <subcategory name="X_B">
    <size sum="..." />
  </subcategory>
</category>
....

I've been looking primarily at Solr's Stats component which, as far as I can see, doesn't allow nested aggregation.

I'd appreciate it if anyone knows of some way to implement this, with or without the Stats component.

Here is a cut-down version of the target schema:

<types>
  <fieldType name="string" class="solr.StrField" />
  <fieldType name="text" class="solr.TextField">
    <analyzer><tokenizer class="solr.StandardTokenizerFactory" /></analyzer>
  </fieldType>
  <fieldType name="date" class="solr.DateField" />
  <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0" />
</types>

<fields>
  <field name="id" type="string" indexed="true" stored="true" />
  <field name="category" type="text" indexed="true" stored="true" />
  <field name="subcategory" type="text" indexed="true" stored="true" />
  <field name="pdate" type="date" indexed="true" stored="true" />
  <field name="size" type="int" indexed="true" stored="true" />
</fields>


Recommended answer

The new faceting module in Solr 5.1 can do this. It was added in https://issues.apache.org/jira/browse/SOLR-7214

Here is how you would add sum(size) to every facet bucket, and sort descending by that statistic.

json.facet={
  categories:{terms:{
    field:category,
    sort:"total_size desc",  // this will sort the facet buckets by your stat 
    facet:{
      total_size:"sum(size)"  // this calculates the stat per bucket
    }
  }}
}
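
As a rough illustration of how the request above might be sent from a client, here is a minimal Python sketch that serializes the same facet definition as strict JSON and puts it into the `json.facet` query parameter. The host, port, and core name (`localhost:8983`, `mycore`) and the helper name `build_facet_params` are assumptions made for the example, not something from the question.

```python
import json
from urllib.parse import urlencode


def build_facet_params(query="*:*"):
    """Build Solr query parameters carrying the category facet with sum(size)."""
    facet = {
        "categories": {
            "terms": {
                "field": "category",
                # Sort the buckets by the per-bucket statistic defined below.
                "sort": "total_size desc",
                "facet": {"total_size": "sum(size)"},
            }
        }
    }
    # rows=0 because only the aggregation is of interest, not the documents.
    return {"q": query, "rows": 0, "json.facet": json.dumps(facet)}


params = build_facet_params()
# Hypothetical Solr location; adjust host and core name to your setup.
url = "http://localhost:8983/solr/mycore/select?" + urlencode(params)
print(url)
```

Building the facet as a Python dict and serializing it with `json.dumps` avoids hand-escaping quotes in the URL.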

And this is how you would add in the subfacet on subcategory:

json.facet={
  categories:{terms:{
    field:category,
    sort:"total_size desc",
    facet:{
      total_size:"sum(size)",
      subcat:{terms:{ // this will facet on the subcategory field for each bucket
        field:subcategory,
        facet:{
          sz:"sum(size)"  // this calculates the sum per sub-cat bucket
        }
      }}
    }
  }}
}

So the above will give you the sum(size) at both the category and subcategory levels. Documentation for the new facet module is currently at http://yonik.com/json-facet-api/
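
To get from the facet response to the category/subcategory layout asked for in the question, the nested buckets can be flattened on the client side. The sketch below assumes the response has been parsed from JSON into a dict and that buckets appear under `facets` → `categories` → `buckets`, each with a `val` plus the stat keys named in the request (`total_size`, `subcat`, `sz`). The sample data and the helper name `nested_sums` are invented for illustration.

```python
def nested_sums(response):
    """Map each category to its total size and per-subcategory sums."""
    result = {}
    categories = response.get("facets", {}).get("categories", {})
    for cat in categories.get("buckets", []):
        sub_buckets = cat.get("subcat", {}).get("buckets", [])
        result[cat["val"]] = {
            "total_size": cat.get("total_size"),
            "subcategories": {sub["val"]: sub.get("sz") for sub in sub_buckets},
        }
    return result


# Made-up sample shaped like a JSON Facet API response (values are not real data).
sample = {
    "facets": {
        "count": 3,
        "categories": {
            "buckets": [
                {
                    "val": "X",
                    "count": 3,
                    "total_size": 60.0,
                    "subcat": {
                        "buckets": [
                            {"val": "X_A", "count": 2, "sz": 50.0},
                            {"val": "X_B", "count": 1, "sz": 10.0},
                        ]
                    },
                }
            ]
        },
    }
}

for name, info in nested_sums(sample).items():
    print(name, info["total_size"], info["subcategories"])
```

The `.get` lookups keep the function from failing when a query matches no documents and the facet keys are missing from the response.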
