Elasticsearch: excluding filters while faceting possible? (like in Solr)
Problem description
I'm looking into changing from Solr to ES. One of the things I can't find info about is whether ES lets me define exclusion filters when faceting.
For example, consider producttype with values A, B, C, which I want to facet on (i.e. show counts for). Also consider that the query is constrained to producttype: A.
In this case Solr allows me to specify that I want to exclude the constraint producttype: A from impacting faceting on producttype. In other words, it displays counts on producttype as if the constraint producttype: A had not been applied.
For how to do this in Solr, see: http://wiki.apache.org/solr/SimpleFacetParameters > Tagging and excluding Filters
Is there any way to do this in ElasticSearch?
Yes, you can.
While you can use filters within the query DSL, the search API also accepts a top-level filter parameter, which is used for filtering the search results AFTER the facets have been calculated.
For example:
1) First, create your index, and because you want product_type to be treated as an enum, set it to be not_analyzed:
curl -XPUT 'http://127.0.0.1:9200/my_index/?pretty=1' -d '
{
   "mappings" : {
      "product" : {
         "properties" : {
            "product_type" : {
               "index" : "not_analyzed",
               "type" : "string"
            },
            "product_name" : {
               "type" : "string"
            }
         }
      }
   }
}
'
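To see why not_analyzed matters for facet counts, here is a rough Python sketch. It is only an approximation: analysis is modelled as lowercasing plus a whitespace split, which is a simplification of what a real analyzer does, and the sample values are hypothetical rather than taken from the index above.

```python
from collections import Counter

def analyzed_terms(value):
    # Simplified stand-in for an analyzed string field (assumption):
    # lowercase the value and split it on whitespace.
    return value.lower().split()

def not_analyzed_terms(value):
    # A not_analyzed field is indexed as one exact term.
    return [value]

def facet_counts(values, to_terms):
    counts = Counter()
    for value in values:
        # Count each distinct term once per document.
        counts.update(set(to_terms(value)))
    return dict(counts)

# Hypothetical field values, one per document.
values = ["Road Bike", "Road Bike", "City Bike"]

print(facet_counts(values, not_analyzed_terms))
print(facet_counts(values, analyzed_terms))
```

With the exact-term version each distinct value gets its own bucket, while the analyzed version produces buckets for the individual words, which is usually not what you want for an enum-like field.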
2) Index some docs (note, doc 3 has a different product_name):
curl -XPUT 'http://127.0.0.1:9200/my_index/product/1?pretty=1' -d '
{
   "product_type" : "A",
   "product_name" : "foo bar"
}
'
curl -XPUT 'http://127.0.0.1:9200/my_index/product/2?pretty=1' -d '
{
   "product_type" : "B",
   "product_name" : "foo bar"
}
'
curl -XPUT 'http://127.0.0.1:9200/my_index/product/3?pretty=1' -d '
{
   "product_type" : "C",
   "product_name" : "bar"
}
'
3) Perform a search for products whose name contains foo (which excludes doc 3 and thus product_type C), calculate facets for product_type for all docs which have foo in the product_name, then filter the search results by product_type == A:
curl -XGET 'http://127.0.0.1:9200/my_index/product/_search?pretty=1' -d '
{
   "query" : {
      "text" : {
         "product_name" : "foo"
      }
   },
   "filter" : {
      "term" : {
         "product_type" : "A"
      }
   },
   "facets" : {
      "product_type" : {
         "terms" : {
            "field" : "product_type"
         }
      }
   }
}
'
# {
#    "hits" : {
#       "hits" : [
#          {
#             "_source" : {
#                "product_type" : "A",
#                "product_name" : "foo bar"
#             },
#             "_score" : 0.19178301,
#             "_index" : "my_index",
#             "_id" : "1",
#             "_type" : "product"
#          }
#       ],
#       "max_score" : 0.19178301,
#       "total" : 1
#    },
#    "timed_out" : false,
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    },
#    "facets" : {
#       "product_type" : {
#          "other" : 0,
#          "terms" : [
#             {
#                "count" : 1,
#                "term" : "B"
#             },
#             {
#                "count" : 1,
#                "term" : "A"
#             }
#          ],
#          "missing" : 0,
#          "_type" : "terms",
#          "total" : 2
#       }
#    },
#    "took" : 3
# }
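The order of operations in step 3 can be mimicked in a few lines of Python. This is only an in-memory sketch of the semantics, not how Elasticsearch works internally: the text query is approximated by a whitespace-token match, and the function names are made up for illustration.

```python
from collections import Counter

# The three documents indexed in step 2.
DOCS = [
    {"_id": "1", "product_type": "A", "product_name": "foo bar"},
    {"_id": "2", "product_type": "B", "product_name": "foo bar"},
    {"_id": "3", "product_type": "C", "product_name": "bar"},
]

def matches_text(doc, field, word):
    # Crude stand-in for the text query (assumption): whitespace-token match.
    return word in doc[field].split()

def search(docs, word, filter_type):
    # 1. The query selects the documents the facet is computed on.
    matched = [d for d in docs if matches_text(d, "product_name", word)]
    # 2. Facet counts come from the query results, before the filter.
    facet = Counter(d["product_type"] for d in matched)
    # 3. The top-level filter narrows only the returned hits.
    hits = [d for d in matched if d["product_type"] == filter_type]
    return hits, dict(facet)

hits, facet = search(DOCS, "foo", "A")
print([h["_id"] for h in hits])  # ['1']
print(facet)                     # {'A': 1, 'B': 1}
```

The facet still reports B even though the only hit is doc 1, which matches the response above.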
4) Perform a search for foo in the product_name, but calculate facets for all products in the index, by specifying the global parameter:
# [Wed Jan 18 17:15:09 2012] Protocol: http, Server: 192.168.5.10:9200
curl -XGET 'http://127.0.0.1:9200/my_index/product/_search?pretty=1' -d '
{
   "query" : {
      "text" : {
         "product_name" : "foo"
      }
   },
   "filter" : {
      "term" : {
         "product_type" : "A"
      }
   },
   "facets" : {
      "product_type" : {
         "global" : 1,
         "terms" : {
            "field" : "product_type"
         }
      }
   }
}
'
# [Wed Jan 18 17:15:09 2012] Response:
# {
#    "hits" : {
#       "hits" : [
#          {
#             "_source" : {
#                "product_type" : "A",
#                "product_name" : "foo bar"
#             },
#             "_score" : 0.19178301,
#             "_index" : "my_index",
#             "_id" : "1",
#             "_type" : "product"
#          }
#       ],
#       "max_score" : 0.19178301,
#       "total" : 1
#    },
#    "timed_out" : false,
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    },
#    "facets" : {
#       "product_type" : {
#          "other" : 0,
#          "terms" : [
#             {
#                "count" : 1,
#                "term" : "C"
#             },
#             {
#                "count" : 1,
#                "term" : "B"
#             },
#             {
#                "count" : 1,
#                "term" : "A"
#             }
#          ],
#          "missing" : 0,
#          "_type" : "terms",
#          "total" : 3
#       }
#    },
#    "took" : 4
# }
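The effect of the global parameter in step 4 can be sketched the same way. Again this is just an in-memory illustration under the same assumptions (whitespace-token matching in place of the text query, a made-up function name): the hits are unchanged, but the facet is counted over every document in the index.

```python
from collections import Counter

# The three documents indexed in step 2.
DOCS = [
    {"_id": "1", "product_type": "A", "product_name": "foo bar"},
    {"_id": "2", "product_type": "B", "product_name": "foo bar"},
    {"_id": "3", "product_type": "C", "product_name": "bar"},
]

def search_with_global_facet(docs, word, filter_type):
    # Hits are still restricted by the query and then by the filter.
    matched = [d for d in docs if word in d["product_name"].split()]
    hits = [d for d in matched if d["product_type"] == filter_type]
    # A global facet ignores both and counts over all documents.
    global_facet = Counter(d["product_type"] for d in docs)
    return hits, dict(global_facet)

hits, facet = search_with_global_facet(DOCS, "foo", "A")
print([h["_id"] for h in hits])  # ['1']
print(facet)                     # {'A': 1, 'B': 1, 'C': 1}
```

Type C now shows up in the counts even though doc 3 does not match foo.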
UPDATE TO ANSWER THE EXPANDED QUESTION FROM THE OP:
You can also apply filters directly to each facet - these are called facet_filters.
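If you build these requests from application code, the Solr-style "exclude my own filter" behaviour can be generated mechanically: each facet gets a facet_filter made of every selected constraint except the one on its own field. The helper below is a hypothetical sketch (build_search_body and term are names I made up); it only assembles the JSON body, using the same filter, and, term and facet_filter shapes as the curl requests in this answer.

```python
import json

def term(field, value):
    return {"term": {field: value}}

def combine(filters):
    # One filter is used as-is; several are wrapped in an "and" filter.
    if len(filters) == 1:
        return filters[0]
    return {"and": filters}

def build_search_body(selected, facet_fields):
    """selected: field -> chosen value; facet_fields: fields to facet on."""
    body = {"facets": {}}
    all_filters = [term(f, v) for f, v in selected.items()]
    if all_filters:
        # The top-level filter applies every selected constraint to the hits.
        body["filter"] = combine(all_filters)
    for field in facet_fields:
        facet = {"terms": {"field": field}}
        # Each facet sees all constraints except the one on its own field.
        others = [term(f, v) for f, v in selected.items() if f != field]
        if others:
            facet["facet_filter"] = combine(others)
        body["facets"][field] = facet
    return body

body = build_search_body({"color": "blue", "type": "A"}, ["color", "type"])
print(json.dumps(body, indent=3))
```

For the color and type selection shown, this produces the same body as the search request in step 3 of the example that follows.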
Similar example to before:
1) Create the index:
curl -XPUT 'http://127.0.0.1:9200/my_index/?pretty=1' -d '
{
   "mappings" : {
      "product" : {
         "properties" : {
            "color" : {
               "index" : "not_analyzed",
               "type" : "string"
            },
            "name" : {
               "type" : "string"
            },
            "type" : {
               "index" : "not_analyzed",
               "type" : "string"
            }
         }
      }
   }
}
'
2) Index some data:
curl -XPUT 'http://127.0.0.1:9200/my_index/product/1?pretty=1' -d '
{
   "color" : "red",
   "name" : "foo bar",
   "type" : "A"
}
'
curl -XPUT 'http://127.0.0.1:9200/my_index/product/2?pretty=1' -d '
{
   "color" : [
      "red",
      "blue"
   ],
   "name" : "foo bar",
   "type" : "B"
}
'
curl -XPUT 'http://127.0.0.1:9200/my_index/product/3?pretty=1' -d '
{
   "color" : [
      "green",
      "blue"
   ],
   "name" : "bar",
   "type" : "C"
}
'
3) Search, filtering on products that have both type == A and color == blue, then run a facet on each attribute that applies only the "other" attribute's filter (so each facet excludes the filter on its own field):
curl -XGET 'http://127.0.0.1:9200/my_index/product/_search?pretty=1' -d '
{
   "filter" : {
      "and" : [
         {
            "term" : {
               "color" : "blue"
            }
         },
         {
            "term" : {
               "type" : "A"
            }
         }
      ]
   },
   "facets" : {
      "color" : {
         "terms" : {
            "field" : "color"
         },
         "facet_filter" : {
            "term" : {
               "type" : "A"
            }
         }
      },
      "type" : {
         "terms" : {
            "field" : "type"
         },
         "facet_filter" : {
            "term" : {
               "color" : "blue"
            }
         }
      }
   }
}
'
# [Wed Jan 18 19:58:25 2012] Response:
# {
#    "hits" : {
#       "hits" : [],
#       "max_score" : null,
#       "total" : 0
#    },
#    "timed_out" : false,
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    },
#    "facets" : {
#       "color" : {
#          "other" : 0,
#          "terms" : [
#             {
#                "count" : 1,
#                "term" : "red"
#             }
#          ],
#          "missing" : 0,
#          "_type" : "terms",
#          "total" : 1
#       },
#       "type" : {
#          "other" : 0,
#          "terms" : [
#             {
#                "count" : 1,
#                "term" : "C"
#             },
#             {
#                "count" : 1,
#                "term" : "B"
#             }
#          ],
#          "missing" : 0,
#          "_type" : "terms",
#          "total" : 2
#       }
#    },
#    "took" : 3
# }
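To make it clearer why the response looks like this (no hits, but non-empty facets), here is the same request replayed in plain Python. It is an in-memory sketch of the semantics only; has_term and facet are hypothetical helpers, and a facet_filter is modelled as a single (field, value) pair.

```python
from collections import Counter

# The three documents indexed in step 2 (color normalised to lists).
DOCS = [
    {"color": ["red"], "name": "foo bar", "type": "A"},
    {"color": ["red", "blue"], "name": "foo bar", "type": "B"},
    {"color": ["green", "blue"], "name": "bar", "type": "C"},
]

def has_term(doc, field, value):
    stored = doc[field]
    return value in stored if isinstance(stored, list) else stored == value

def facet(docs, field, facet_filter):
    # Count the terms of `field` over the docs that pass this facet's own filter.
    f_field, f_value = facet_filter
    counts = Counter()
    for doc in docs:
        if has_term(doc, f_field, f_value):
            stored = doc[field]
            counts.update(stored if isinstance(stored, list) else [stored])
    return dict(counts)

# Top-level filter: color == blue AND type == A (no document satisfies both).
hits = [d for d in DOCS if has_term(d, "color", "blue") and has_term(d, "type", "A")]
color_facet = facet(DOCS, "color", ("type", "A"))     # only the type filter
type_facet = facet(DOCS, "type", ("color", "blue"))   # only the color filter

print(hits)         # []
print(color_facet)  # {'red': 1}
print(type_facet)   # {'B': 1, 'C': 1}
```

The color facet only sees doc 1 (the one with type A), and the type facet only sees docs 2 and 3 (the blue ones), which is exactly what the counts in the response show.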