ElasticSearch - issue with sub term aggregation with array fields


Question

I have the following two documents:

{
"title":"The Avengers",
"year":2012,
"casting":[
    {
    "name":"Robert Downey Jr.",
    "category":"Actor"
    },
    {
    "name":"Chris Evans",
    "category":"Actor"
    }
]
}

and:

{
"title":"The Judge",
"year":2014,
"casting":[
    {
    "name":"Robert Downey Jr.",
    "category":"Producer"
    },
    {
    "name":"Robert Duvall",
    "category":"Actor"
    }
]
}

I would like to perform aggregations based on two fields: casting.name and casting.category.

I tried a terms aggregation on the casting.name field with a sub-aggregation, which is another terms aggregation on the casting.category field.

The problem is that for the "Chris Evans" entry, Elasticsearch returns buckets for ALL categories (Actor, Producer), whereas it should return only one bucket (Actor).

It seems that there is a Cartesian product between all casting.category occurrences and all casting.name occurrences. It behaves like this with array fields (casting), whereas I don't have the problem with simple fields (such as title or year).

I also tried to use nested aggregations, though maybe not properly, and Elasticsearch throws an error saying that casting.category is not a nested field.

Any ideas?

Recommended answer

Elasticsearch flattens the inner objects of an array field, so internally each document is stored like this:

{  
"title":"The Judge",
"year":2014,
"casting.name": ["Robert Downey Jr.","Robert Duvall"],
"casting.category": ["Producer", "Actor"]
}
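The effect of this flattening on the sub-aggregation can be reproduced outside Elasticsearch. The following is a minimal Python sketch, not Elasticsearch code: `flatten` and `terms_with_sub_terms` are hypothetical helpers that mimic the behaviour described above, and the sketch assumes the field values are indexed as exact, un-analyzed strings. With only the two sample documents, the stray bucket shows up for "Robert Duvall" (who picks up a Producer bucket), and the Actor count for "Robert Downey Jr." is inflated.

```python
from collections import defaultdict

# The two sample documents from the question.
movies = [
    {"title": "The Avengers", "year": 2012, "casting": [
        {"name": "Robert Downey Jr.", "category": "Actor"},
        {"name": "Chris Evans", "category": "Actor"},
    ]},
    {"title": "The Judge", "year": 2014, "casting": [
        {"name": "Robert Downey Jr.", "category": "Producer"},
        {"name": "Robert Duvall", "category": "Actor"},
    ]},
]


def flatten(doc):
    """Mimic the default object mapping: one value list per dotted field path."""
    flat = {key: value for key, value in doc.items() if key != "casting"}
    flat["casting.name"] = [entry["name"] for entry in doc["casting"]]
    flat["casting.category"] = [entry["category"] for entry in doc["casting"]]
    return flat


def terms_with_sub_terms(docs):
    """Count documents per (name, category) pair, as a terms/sub-terms pair would."""
    buckets = defaultdict(lambda: defaultdict(int))
    for doc in docs:
        # A document falls into a bucket for every distinct value it holds,
        # so every name in it is paired with every category in it.
        for name in set(doc["casting.name"]):
            for category in set(doc["casting.category"]):
                buckets[name][category] += 1
    return {name: dict(categories) for name, categories in buckets.items()}


flat_buckets = terms_with_sub_terms([flatten(movie) for movie in movies])
for name, categories in sorted(flat_buckets.items()):
    print(name, categories)
```

Because the link between a name and its own category is lost once the arrays are split, the pairing happens per document rather than per cast entry.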

If you want to keep the relationship between each name and its category, you'll need to use either nested objects or a parent-child relationship.

To use a nested mapping, you'd need something like this:

{
  "mappings": {
    "movies": {
      "properties": {
        "title" : { "type": "string" },
        "year" : { "type": "integer" },
        "casting": {
          "type": "nested",
          "properties": {
            "name":    { "type": "string" },
            "category": { "type": "string" }
          }
        }
      }
    }
  }
}
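Once casting is mapped as nested (which generally means reindexing the documents), the terms aggregations have to be wrapped in a nested aggregation that points at the casting path. The sketch below builds such a request body as a Python dictionary and prints it as JSON. The aggregation names `cast`, `names` and `categories` are arbitrary labels chosen for illustration, and the sketch assumes the name and category values are indexed as exact strings; with default analysis on `string` fields the buckets would likely be individual tokens rather than full names.

```python
import json

# Search request body: a nested aggregation scoped to the "casting" path,
# with the name terms aggregation and its category sub-aggregation inside it.
nested_agg_body = {
    "size": 0,  # only the aggregation results are of interest, not the hits
    "aggs": {
        "cast": {
            "nested": {"path": "casting"},
            "aggs": {
                "names": {
                    "terms": {"field": "casting.name"},
                    "aggs": {
                        "categories": {
                            "terms": {"field": "casting.category"}
                        }
                    },
                }
            },
        }
    },
}

print(json.dumps(nested_agg_body, indent=2))
```

Inside the nested scope each cast entry is treated as its own hidden document, so a name is only paired with the category stored in the same entry.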
