Elasticsearch - generic facets structure - calculating aggregations combined with filters


Question

In a new project of ours, we were inspired by the article http://project-a.github.io/on-site-search-design-patterns-for-e-commerce/#generic-faceted-search when designing our "facet" structure. I have it working to the extent the article describes, but I have run into issues when facets are selected. I hope someone can give me a hint as to what to try, so I don’t have to rework all our aggregations into separate aggregation calculations again.

The problem is basically that we use a single aggregation to calculate all the "facets" at once, but when I add a filter (e.g. checking a brand name), it "removes" all the other brands from the returned aggregations. What I want is for that brand to be used as a filter when calculating the other facets, but not when calculating the brand aggregation itself. This is necessary so the user can, for example, choose multiple brands.

Looking at https://www.contorion.de/search/Metabo_Fein/ou1-ou2?q=Winkelschleifer&c=bovy (the site described in the above article), I have selected the "Metabo" and "Fein" manufacturers (Hersteller), and when I unfold the Hersteller menu it shows all manufacturers, not just the selected ones. So I know it’s possible somehow, and I hope someone out there has a hint as to how to write the aggregations / filters so that I get the "correct e-commerce facet behavior".

On the products in ES I have the following structure (the same as in the original article, though "C#’ified" in naming):

"attributeStrings": [
    {
        "facetName": "Property",
        "facetValue": "Organic"
    },
    {
        "facetName": "Property",
        "facetValue": "Without parfume"
    },
    {
        "facetName": "Brand",
        "facetValue": "Adidas"
    }
]

So the above product has 2 attributes/facet groups – Property with 2 values (Organic, Without parfume) and Brand with 1 value (Adidas). Without any filters I calculate the aggregations from the following query:

  "aggs": {
    "agg_attr_strings_filter": {
      "filter": {},
      "aggs": {
        "agg_attr_strings": {
          "nested": {
            "path": "attributeStrings"
          },
          "aggs": {
            "attr_name": {
              "terms": {
                "field": "attributeStrings.facetName"
              },
              "aggs": {
                "attr_value": {
                  "terms": {
                    "field": "attributeStrings.facetValue",
                    "size": 1000,
                    "order": [
                      {
                        "_term": "asc"
                      }
                    ]
                  }
                }
              }
            }
          }
        }
      }
    }
  }

Now if I select Property "Organic" and Brand "Adidas", I build the same aggregation, but with a filter that applies those two constraints (which is where it kind of goes wrong...):

  "aggs": {
    "agg_attr_strings_filter": {
      "filter": {
        "bool": {
          "filter": [
            {
              "nested": {
                "query": {
                  "bool": {
                    "filter": [
                      {
                        "term": {
                          "attributeStrings.facetName": {
                            "value": "Property"
                          }
                        }
                      },
                      {
                        "terms": {
                          "attributeStrings.facetValue": [
                            "Organic"
                          ]
                        }
                      }
                    ]
                  }
                },
                "path": "attributeStrings"
              }
            },
            {
              "nested": {
                "query": {
                  "bool": {
                    "filter": [
                      {
                        "term": {
                          "attributeStrings.facetName": {
                            "value": "Brand"
                          }
                        }
                      },
                      {
                        "terms": {
                          "attributeStrings.facetValue": [
                            "Adidas"
                          ]
                        }
                      }
                    ]
                  }
                },
                "path": "attributeStrings"
              }
            }
          ]
        }
      },
      "aggs": {
        "agg_attr_strings": {
          "nested": {
            "path": "attributeStrings"
          },
          "aggs": {
            "attr_name": {
              "terms": {
                "field": "attributeStrings.facetName"
              },
              "aggs": {
                "attr_value": {
                  "terms": {
                    "field": "attributeStrings.facetValue",
                    "size": 1000,
                    "order": [
                      {
                        "_term": "asc"
                      }
                    ]
                  }
                }
              }
            }
          }
        }
      }
    }
  }

The only way I can see forward with this model, is to calculate the aggregation for each selected facet and somehow merge the result. But it seems very complex and kind of defeats the point of having the model as described in the article, so I hope there's a more clean solution and someone can give a hint at something to try.

Answer

"The only way I can see forward with this model, is to calculate the aggregation for each selected facet and somehow merge the result."

This is exactly right. If one facet (e.g. brand) is selected, then you cannot use a global brand filter if you also want to fetch the other brands for multi-selection. What you can do is apply all other filters to the selected facets, and all filters to the non-selected facets. As a result you will have n+1 separate aggregations for n selected filters: the first one is for all facets and the rest are for the selected facets.
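The n+1 rule can be sketched as a small query builder. This is only an illustration in Python, not the code behind the article: the helper names (`nested_selection_filter`, `combine`, `build_facet_aggs`) are hypothetical, the inner facet-name filter uses a plain `term` query instead of the `query`/`match` wrapper, `match_all` is used when a facet has no other filters, and the `order` clause is omitted.

```python
def nested_selection_filter(facet_name, values, path="attributeStrings"):
    """Nested filter: the product has one of `values` for `facet_name`."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "filter": [
                        {"term": {f"{path}.facetName": {"value": facet_name}}},
                        {"terms": {f"{path}.facetValue": list(values)}},
                    ]
                }
            },
        }
    }


def combine(filters):
    """AND the filters together; fall back to match_all when there are none."""
    return {"bool": {"filter": filters}} if filters else {"match_all": {}}


def value_terms(path):
    """Terms aggregation over facet values (size 1000, as in the question)."""
    return {"terms": {"field": f"{path}.facetValue", "size": 1000}}


def build_facet_aggs(selected, path="attributeStrings"):
    """Build the "aggs" body: one general aggregation plus one per selected facet.

    `selected` maps facet name -> selected values,
    e.g. {"Property": ["Organic"], "Brand": ["Adidas"]}.
    """
    def filters_except(skip=None):
        return [
            nested_selection_filter(name, values, path)
            for name, values in selected.items()
            if name != skip
        ]

    aggs = {
        # General aggregation: every selected filter applies.
        "agg_attr_strings_filter": {
            "filter": combine(filters_except()),
            "aggs": {
                "agg_attr_strings": {
                    "nested": {"path": path},
                    "aggs": {
                        "attr_name": {
                            "terms": {"field": f"{path}.facetName"},
                            "aggs": {"attr_value": value_terms(path)},
                        }
                    },
                }
            },
        }
    }
    for name in selected:
        key = "special_agg_" + name.lower()
        aggs[key] = {
            # Special aggregation: all filters except this facet's own.
            "filter": combine(filters_except(skip=name)),
            "aggs": {
                key: {
                    "nested": {"path": path},
                    "aggs": {
                        "agg_filtered_special": {
                            # Keep only nested entries that belong to this facet.
                            "filter": {"term": {f"{path}.facetName": name}},
                            "aggs": {"facet_value": value_terms(path)},
                        }
                    },
                }
            },
        }
    return aggs


selected = {"Property": ["Organic"], "Brand": ["Adidas"]}
print(sorted(build_facet_aggs(selected)))
```

With two selected facets this yields three top-level aggregations (n+1 for n = 2). The returned dictionary would go under the `aggs` key of the search request body.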

In your case the query might look like:

{
  "aggs": {
    "agg_attr_strings_filter": {
      "filter": {
        "bool": {
          "filter": [
            {
              "nested": {
                "query": {
                  "bool": {
                    "filter": [
                      {
                        "term": {
                          "attributeStrings.facetName": {
                            "value": "Property"
                          }
                        }
                      },
                      {
                        "terms": {
                          "attributeStrings.facetValue": [
                            "Organic"
                          ]
                        }
                      }
                    ]
                  }
                },
                "path": "attributeStrings"
              }
            },
            {
              "nested": {
                "query": {
                  "bool": {
                    "filter": [
                      {
                        "term": {
                          "attributeStrings.facetName": {
                            "value": "Brand"
                          }
                        }
                      },
                      {
                        "terms": {
                          "attributeStrings.facetValue": [
                            "Adidas"
                          ]
                        }
                      }
                    ]
                  }
                },
                "path": "attributeStrings"
              }
            }
          ]
        }
      },
      "aggs": {
        "agg_attr_strings": {
          "nested": {
            "path": "attributeStrings"
          },
          "aggs": {
            "attr_name": {
              "terms": {
                "field": "attributeStrings.facetName"
              },
              "aggs": {
                "attr_value": {
                  "terms": {
                    "field": "attributeStrings.facetValue",
                    "size": 1000,
                    "order": [
                      {
                        "_term": "asc"
                      }
                    ]
                  }
                }
              }
            }
          }
        }
      }
    },
    "special_agg_property": {
      "filter": {
        "nested": {
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "attributeStrings.facetName": {
                      "value": "Brand"
                    }
                  }
                },
                {
                  "terms": {
                    "attributeStrings.facetValue": [
                      "Adidas"
                    ]
                  }
                }
              ]
            }
          },
          "path": "attributeStrings"
        }
      },
      "aggs": {
        "special_agg_property": {
          "nested": {
            "path": "attributeStrings"
          },
          "aggs": {
            "agg_filtered_special": {
              "filter": {
                "query": {
                  "match": {
                    "attributeStrings.facetName": "Property"
                  }
                }
              },
              "aggs": {
                "facet_value": {
                  "terms": {
                    "size": 1000,
                    "field": "attributeStrings.facetValue"
                  }
                }
              }
            }
          }
        }
      }
    },
    "special_agg_brand": {
      "filter": {
        "nested": {
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "attributeStrings.facetName": {
                      "value": "Property"
                    }
                  }
                },
                {
                  "terms": {
                    "attributeStrings.facetValue": [
                      "Organic"
                    ]
                  }
                }
              ]
            }
          },
          "path": "attributeStrings"
        }
      },
      "aggs": {
        "special_agg_brand": {
          "nested": {
            "path": "attributeStrings"
          },
          "aggs": {
            "agg_filtered_special": {
              "filter": {
                "query": {
                  "match": {
                    "attributeStrings.facetName": "Brand"
                  }
                }
              },
              "aggs": {
                "facet_value": {
                  "terms": {
                    "size": 1000,
                    "field": "attributeStrings.facetValue"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

This query looks big and scary, but generating it takes only a few dozen lines of code.

When parsing the results, first parse the general aggregation (the one that uses all filters) and then the special facet aggregations. In the example above, first parse the results from agg_attr_strings_filter. Those results also contain aggregation values for Brand and Property, which should be overwritten by the values from special_agg_brand and special_agg_property.

The query is also efficient: Elasticsearch does a good job of caching separate filter clauses, so applying the same filters in different parts of the query should be cheap.
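The parse order described here (general aggregation first, then overwrite the selected facets) could look roughly like the following Python sketch. The function name `merge_facets` is hypothetical, the response shape assumes the usual Elasticsearch `buckets` / `key` / `doc_count` layout for the aggregation names used in the query, and the sample counts and the brand "Nike" are made up for illustration.

```python
def merge_facets(aggregations, selected_names):
    """Return {facet_name: {facet_value: doc_count}}.

    `aggregations` is the "aggregations" part of the search response;
    `selected_names` lists the facets that have a special aggregation.
    """
    facets = {}

    # 1. General aggregation (all filters applied): covers every facet.
    general = aggregations["agg_attr_strings_filter"]["agg_attr_strings"]
    for name_bucket in general["attr_name"]["buckets"]:
        facets[name_bucket["key"]] = {
            b["key"]: b["doc_count"]
            for b in name_bucket["attr_value"]["buckets"]
        }

    # 2. Special aggregations: overwrite the selected facets with counts
    #    computed without the facet's own filter.
    for name in selected_names:
        key = "special_agg_" + name.lower()
        buckets = aggregations[key][key]["agg_filtered_special"]["facet_value"]["buckets"]
        facets[name] = {b["key"]: b["doc_count"] for b in buckets}

    return facets


def special(key, buckets):
    """Build a sample special-aggregation response (illustrative data only)."""
    return {key: {"agg_filtered_special": {"facet_value": {"buckets": buckets}}}}


# Hypothetical response for the selection Property=Organic, Brand=Adidas.
sample = {
    "agg_attr_strings_filter": {
        "agg_attr_strings": {
            "attr_name": {
                "buckets": [
                    {"key": "Brand", "attr_value": {"buckets": [
                        {"key": "Adidas", "doc_count": 3},
                    ]}},
                    {"key": "Property", "attr_value": {"buckets": [
                        {"key": "Organic", "doc_count": 3},
                        {"key": "Without parfume", "doc_count": 1},
                    ]}},
                ]
            }
        }
    },
    "special_agg_brand": special("special_agg_brand", [
        {"key": "Adidas", "doc_count": 3},
        {"key": "Nike", "doc_count": 2},
    ]),
    "special_agg_property": special("special_agg_property", [
        {"key": "Organic", "doc_count": 3},
        {"key": "Without parfume", "doc_count": 2},
    ]),
}

merged = merge_facets(sample, ["Property", "Brand"])
print(merged["Brand"])
```

In the sample, the general aggregation only sees Adidas under Brand, while the merged result also lists the other brand, which is the multi-select behavior the question asks for.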

"But it seems very complex and kind of defeats the point of having the model as described in the article, so I hope there's a more clean solution and someone can give a hint at something to try."

There is really no way around the fact that you need to apply different filters to different facets and at the same time have different query filters. If you need to support "correct e-commerce facet behavior", you will have a complex query :)

Disclaimer: I'm a coauthor of the mentioned article.
