Elasticsearch: excluding filters while faceting possible? (like in Solr)


Problem description

I'm looking into switching from Solr to Elasticsearch (ES). One thing I can't find information about is whether ES lets me define exclusion filters when faceting.

For example, consider a field producttype with the values A, B and C, which I want to facet on (i.e., show counts for). Also assume that the query is constrained to producttype: A.

In this case, Solr lets me specify that the constraint producttype: A should not affect faceting on producttype. In other words, it displays counts on producttype as if the constraint producttype: A had not been applied.

For how to do this in Solr, see: http://wiki.apache.org/solr/SimpleFacetParameters > Tagging and excluding Filters
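For reference, the Solr tag-and-exclude syntax looks roughly like the sketch below. It only builds the request parameters in Python and sends nothing; the tag name pt and the match-all q are illustrative assumptions, not taken from the question.

```python
from urllib.parse import urlencode

# Illustrative Solr request parameters. The tag name "pt" is arbitrary,
# and q=*:* is an assumed match-all query.
params = [
    ("q", "*:*"),
    ("fq", "{!tag=pt}producttype:A"),        # tag this filter query as "pt"
    ("facet", "true"),
    ("facet.field", "{!ex=pt}producttype"),  # ignore filters tagged "pt" for this facet
]

# URL-encode the parameters as they would appear in a /select request.
query_string = urlencode(params)
print(query_string)
```

The fq filter still restricts the returned documents to producttype A, while the facet on producttype is counted as if that filter were absent.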

Is there any way to do this in Elasticsearch?

Solution

Yes, you can.

While you can use filters within the query DSL, the search API also accepts a top-level filter parameter, which is used for filtering the search results AFTER the facets have been calculated.

For example:

1) First, create your index, and because you want product_type to be treated as an enum, set it to be not_analyzed:

curl -XPUT 'http://127.0.0.1:9200/my_index/?pretty=1'  -d '
{
   "mappings" : {
      "product" : {
         "properties" : {
            "product_type" : {
               "index" : "not_analyzed",
               "type" : "string"
            },
            "product_name" : {
               "type" : "string"
            }
         }
      }
   }
}
'

2) Index some docs (note, doc 3 has a different product_name):

curl -XPUT 'http://127.0.0.1:9200/my_index/product/1?pretty=1'  -d '
{
   "product_type" : "A",
   "product_name" : "foo bar"
}
'
curl -XPUT 'http://127.0.0.1:9200/my_index/product/2?pretty=1'  -d '
{
   "product_type" : "B",
   "product_name" : "foo bar"
}
'
curl -XPUT 'http://127.0.0.1:9200/my_index/product/3?pretty=1'  -d '
{
   "product_type" : "C",
   "product_name" : "bar"
}
'

3) Perform a search for products whose name contains foo (which excludes doc 3 and thus product_type C), calculate facets for product_type for all docs which have foo in the product_name, then filter the search results by product_type == A:

curl -XGET 'http://127.0.0.1:9200/my_index/product/_search?pretty=1'  -d '
{
   "query" : {
      "text" : {
         "product_name" : "foo"
      }
   },
   "filter" : {
      "term" : {
         "product_type" : "A"
      }
   },
   "facets" : {
      "product_type" : {
         "terms" : {
            "field" : "product_type"
         }
      }
   }
}
'

# {
#    "hits" : {
#       "hits" : [
#          {
#             "_source" : {
#                "product_type" : "A",
#                "product_name" : "foo bar"
#             },
#             "_score" : 0.19178301,
#             "_index" : "my_index",
#             "_id" : "1",
#             "_type" : "product"
#          }
#       ],
#       "max_score" : 0.19178301,
#       "total" : 1
#    },
#    "timed_out" : false,
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    },
#    "facets" : {
#       "product_type" : {
#          "other" : 0,
#          "terms" : [
#             {
#                "count" : 1,
#                "term" : "B"
#             },
#             {
#                "count" : 1,
#                "term" : "A"
#             }
#          ],
#          "missing" : 0,
#          "_type" : "terms",
#          "total" : 2
#       }
#    },
#    "took" : 3
# }
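
The order of operations in step 3 can be modelled in plain Python. This is only a sketch of the semantics and not how Elasticsearch is implemented; the helper matches_query is a hypothetical stand-in for the text query.

```python
from collections import Counter

# The three documents indexed in step 2.
docs = [
    {"id": 1, "product_type": "A", "product_name": "foo bar"},
    {"id": 2, "product_type": "B", "product_name": "foo bar"},
    {"id": 3, "product_type": "C", "product_name": "bar"},
]

def matches_query(doc, word):
    # Crude stand-in for the text query: whitespace-tokenised word match.
    return word in doc["product_name"].split()

# 1. The query decides which documents the facets are computed over.
matched = [d for d in docs if matches_query(d, "foo")]

# 2. Facet counts are taken from the query matches, before the top-level filter.
facet_counts = Counter(d["product_type"] for d in matched)

# 3. The top-level filter narrows only the returned hits.
hits = [d for d in matched if d["product_type"] == "A"]

print(dict(facet_counts))        # {'A': 1, 'B': 1}
print([d["id"] for d in hits])   # [1]
```

The counts agree with the response above: the facet still reports B even though the only returned hit is the type A document.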

4) Perform a search for foo in the product_name, but calculate facets for all products in the index, by specifying the global parameter:

# [Wed Jan 18 17:15:09 2012] Protocol: http, Server: 192.168.5.10:9200
curl -XGET 'http://127.0.0.1:9200/my_index/product/_search?pretty=1'  -d '
{
   "query" : {
      "text" : {
         "product_name" : "foo"
      }
   },
   "filter" : {
      "term" : {
         "product_type" : "A"
      }
   },
   "facets" : {
      "product_type" : {
         "global" : 1,
         "terms" : {
            "field" : "product_type"
         }
      }
   }
}
'

# [Wed Jan 18 17:15:09 2012] Response:
# {
#    "hits" : {
#       "hits" : [
#          {
#             "_source" : {
#                "product_type" : "A",
#                "product_name" : "foo bar"
#             },
#             "_score" : 0.19178301,
#             "_index" : "my_index",
#             "_id" : "1",
#             "_type" : "product"
#          }
#       ],
#       "max_score" : 0.19178301,
#       "total" : 1
#    },
#    "timed_out" : false,
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    },
#    "facets" : {
#       "product_type" : {
#          "other" : 0,
#          "terms" : [
#             {
#                "count" : 1,
#                "term" : "C"
#             },
#             {
#                "count" : 1,
#                "term" : "B"
#             },
#             {
#                "count" : 1,
#                "term" : "A"
#             }
#          ],
#          "missing" : 0,
#          "_type" : "terms",
#          "total" : 3
#       }
#    },
#    "took" : 4
# }
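
The effect of global can be modelled the same way. Again this is a simplified sketch with a hypothetical matches_query helper: a global facet is counted over every document in the index, ignoring both the query and the top-level filter.

```python
from collections import Counter

# The three documents indexed in step 2.
docs = [
    {"id": 1, "product_type": "A", "product_name": "foo bar"},
    {"id": 2, "product_type": "B", "product_name": "foo bar"},
    {"id": 3, "product_type": "C", "product_name": "bar"},
]

def matches_query(doc, word):
    # Crude stand-in for the text query: whitespace-tokenised word match.
    return word in doc["product_name"].split()

# Hits are still restricted by the query and then by the top-level filter.
hits = [d for d in docs
        if matches_query(d, "foo") and d["product_type"] == "A"]

# A global facet ignores the query and the filter and counts over all documents.
global_counts = Counter(d["product_type"] for d in docs)

print(dict(global_counts))       # {'A': 1, 'B': 1, 'C': 1}
print([d["id"] for d in hits])   # [1]
```

Type C now appears in the counts even though doc 3 does not match the query, which is what the response above shows.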

UPDATE TO ANSWER THE EXPANDED QUESTION FROM THE OP:

You can also apply filters directly to each facet - these are called facet_filters.

Similar example to before:

1) Create the index:

curl -XPUT 'http://127.0.0.1:9200/my_index/?pretty=1'  -d '
{
   "mappings" : {
      "product" : {
         "properties" : {
            "color" : {
               "index" : "not_analyzed",
               "type" : "string"
            },
            "name" : {
               "type" : "string"
            },
            "type" : {
               "index" : "not_analyzed",
               "type" : "string"
            }
         }
      }
   }
}
'

2) Index some data:

curl -XPUT 'http://127.0.0.1:9200/my_index/product/1?pretty=1'  -d '
{
   "color" : "red",
   "name" : "foo bar",
   "type" : "A"
}
'

curl -XPUT 'http://127.0.0.1:9200/my_index/product/2?pretty=1'  -d '
{
   "color" : [
      "red",
      "blue"
   ],
   "name" : "foo bar",
   "type" : "B"
}
'

curl -XPUT 'http://127.0.0.1:9200/my_index/product/3?pretty=1'  -d '
{
   "color" : [
      "green",
      "blue"
   ],
   "name" : "bar",
   "type" : "C"
}
'

3) Search, filtering on products that have both type == A and color == blue, then run facets on each attribute, excluding that attribute's own filter and applying only the "other" one:

curl -XGET 'http://127.0.0.1:9200/my_index/product/_search?pretty=1'  -d '
{
   "filter" : {
      "and" : [
         {
            "term" : {
               "color" : "blue"
            }
         },
         {
            "term" : {
               "type" : "A"
            }
         }
      ]
   },
   "facets" : {
      "color" : {
         "terms" : {
            "field" : "color"
         },
         "facet_filter" : {
            "term" : {
               "type" : "A"
            }
         }
      },
      "type" : {
         "terms" : {
            "field" : "type"
         },
         "facet_filter" : {
            "term" : {
               "color" : "blue"
            }
         }
      }
   }
}
'

# [Wed Jan 18 19:58:25 2012] Response:
# {
#    "hits" : {
#       "hits" : [],
#       "max_score" : null,
#       "total" : 0
#    },
#    "timed_out" : false,
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    },
#    "facets" : {
#       "color" : {
#          "other" : 0,
#          "terms" : [
#             {
#                "count" : 1,
#                "term" : "red"
#             }
#          ],
#          "missing" : 0,
#          "_type" : "terms",
#          "total" : 1
#       },
#       "type" : {
#          "other" : 0,
#          "terms" : [
#             {
#                "count" : 1,
#                "term" : "C"
#             },
#             {
#                "count" : 1,
#                "term" : "B"
#             }
#          ],
#          "missing" : 0,
#          "_type" : "terms",
#          "total" : 2
#       }
#    },
#    "took" : 3
# }
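
The per-facet exclusion can be modelled in plain Python as well. This is a sketch under simplifying assumptions: the helper names are hypothetical, and color is stored as a list for every document, whereas doc 1 was indexed with a single value.

```python
from collections import Counter

# The three documents indexed in step 2; color is normalised to a list.
docs = [
    {"id": 1, "color": ["red"], "type": "A"},
    {"id": 2, "color": ["red", "blue"], "type": "B"},
    {"id": 3, "color": ["green", "blue"], "type": "C"},
]

def is_type_a(doc):
    return doc["type"] == "A"

def is_blue(doc):
    return "blue" in doc["color"]

# The top-level filter applies both constraints to the returned hits.
hits = [d for d in docs if is_blue(d) and is_type_a(d)]

# Each facet applies only the *other* attribute's filter, so its own
# constraint is effectively excluded from its counts.
color_counts = Counter(c for d in docs if is_type_a(d) for c in d["color"])
type_counts = Counter(d["type"] for d in docs if is_blue(d))

print(hits)                 # []
print(dict(color_counts))   # {'red': 1}
print(dict(type_counts))    # {'B': 1, 'C': 1}
```

No document is both type A and blue, so there are no hits, yet each facet still shows what would be available if its own constraint were lifted.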
