How to search for both singular and plural forms of a word in Elasticsearch?
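Problem description
I am building an Elasticsearch query with the Q object. I have indexed several documents, one of which contains "jbl speakers are great", but my query contains "speaker" instead of "speakers". How can I find this document with a query string?
I tried match_phrase, but it does not find the document. When I tried query_string, it threw an error saying "query_string does not support for some key". I also tried a wildcard query, but that does not work either. The full bool query and the Q('match_phrase', prod_cat_for_search='speaker') call I used are reproduced further down.
I expect the output to include the document containing "speakers", but the actual output contains no such document.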
The type of search you are looking for can be achieved by applying a stemmer token filter at index time. The query and the Q call from the question are listed first below for reference. The example mapping that follows them (PUT test) shows how stemming works. See the Elasticsearch documentation for more on stemming.
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "prod_group": "06"
          }
        },
        {
          "match_phrase": {
            "prod_group": "apparel"
          }
        },
        {
          "wildcard": {
            "prod_cat_for_search": "+speaker*"
          }
        },
        {
          "range": {
            "date": {
              "gte": "2018-04-07"
            }
          }
        }
      ]
    }
  }
}
Q('match_phrase', prod_cat_for_search='speaker')
PUT test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "filter": [
            "lowercase",
            "my_stemmer"
          ],
          "tokenizer": "whitespace"
        }
      },
      "filter": {
        "my_stemmer": {
          "type": "stemmer",
          "name": "english"
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "description": {
          "type": "text",
          "analyzer": "my_analyzer",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}
For the description field in the mapping above, the analyzer is set to my_analyzer. This analyzer applies the token filters lowercase and my_stemmer, and my_stemmer applies english stemming to the input value.
For example, if we index a document as below:
{
  "description": "JBL speakers build with perfection"
}
The tokens that will get indexed are:
jbl
speaker
build
with
perfect
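The token list above can be imitated with a small, self-contained Python sketch. It is only an illustration of the tokenizer-then-filters pipeline: toy_stem is a hypothetical stand-in with two hand-written suffix rules, not the algorithm Elasticsearch's english stemmer actually uses, and it will give wrong results on most other words.

```python
from __future__ import annotations


def toy_stem(token: str) -> str:
    """Crude stand-in for English stemming (assumption: two suffix rules only)."""
    if token.endswith("ion") and len(token) > 5:
        return token[:-3]  # perfection -> perfect
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]  # speakers -> speaker
    return token


def toy_analyze(text: str) -> list[str]:
    """Mimic my_analyzer: whitespace tokenizer, then lowercase, then stemming."""
    return [toy_stem(tok.lower()) for tok in text.split()]


# Tokens stored for the example document.
indexed = toy_analyze("JBL speakers build with perfection")
print(indexed)


def matches(query: str) -> bool:
    """The query goes through the same analysis before tokens are compared."""
    return any(tok in indexed for tok in toy_analyze(query))


for q in ("speaker", "speakers", "perfect", "perfection"):
    print(q, matches(q))
```

Because the query text passes through the same toy_analyze function as the document, the singular and plural forms end up as the same token, which is the effect the real stemmer filter provides.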
Notice that speakers is indexed as speaker and perfection as perfect.
Now if you search for speakers or speaker, both will match. Similarly, if you search for perfect, the above document will match.
You may wonder why speakers or perfection also match. The reason is that, by default, Elasticsearch applies the same analyzer at search time that was used at index time. So if you search for perfection, it actually searches for perfect, hence the match.
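Applied to the field from the question, the same idea could look roughly like the sketch below. It only builds the request bodies as Python dicts. The helper names build_index_body and build_query are hypothetical, the mapping is written without the "doc" type level used in the mapping above (an assumption that suits newer Elasticsearch versions; keep the type level if your version expects it), and the prod_group clauses from the question are left out for brevity.

```python
import json


def build_index_body(field: str = "prod_cat_for_search") -> dict:
    """Index settings and mapping that attach the stemming analyzer to `field`."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    "my_analyzer": {
                        "type": "custom",
                        "tokenizer": "whitespace",
                        "filter": ["lowercase", "my_stemmer"],
                    }
                },
                "filter": {"my_stemmer": {"type": "stemmer", "name": "english"}},
            }
        },
        # Assumption: typeless mapping; the example mapping nests this under "doc".
        "mappings": {
            "properties": {field: {"type": "text", "analyzer": "my_analyzer"}}
        },
    }


def build_query(term: str, field: str = "prod_cat_for_search",
                since: str = "2018-04-07") -> dict:
    """Bool query using an analyzed match clause instead of the wildcard."""
    return {
        "query": {
            "bool": {
                "must": [
                    # With the Q object this clause would presumably be
                    # Q('match', prod_cat_for_search='speaker').
                    {"match": {field: term}},
                    {"range": {"date": {"gte": since}}},
                ]
            }
        }
    }


print(json.dumps(build_index_body(), indent=2))
print(json.dumps(build_query("speaker"), indent=2))
```

A match query is analyzed at search time, so "speaker" and "speakers" reduce to the same token. Documents indexed before the analyzer was added would need to be reindexed for the stemmed tokens to exist.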