ElasticSearch n-gram tokenfilter not finding partial words
I have been playing around with ElasticSearch for a new project of mine. I have set the default analyzers to use the ngram tokenfilter. This is my elasticsearch.yml file:
index:
  analysis:
    analyzer:
      default_index:
        tokenizer: standard
        filter: [standard, stop, mynGram]
      default_search:
        tokenizer: standard
        filter: [standard, stop]
    filter:
      mynGram:
        type: nGram
        min_gram: 1
        max_gram: 10
I created a new index and added the following document to it:
$ curl -XPUT http://localhost:9200/test/newtype/3 -d '{"text": "one two three four five six"}'
{"ok":true,"_index":"test","_type":"newtype","_id":"3"}
However, when I search using the query text:hree or text:ive or any other partial term, ElasticSearch does not return this document. It returns the document only when I search for an exact term (like text:two).
I have also tried changing the config file such that default_search also uses the ngram token filter, but the result was the same. What am I doing wrong here and how do I correct it?
Not sure about the default_* settings. But applying a mapping that specifies index_analyzer and search_analyzer works:
curl -XDELETE localhost:9200/twitter
curl -XPOST localhost:9200/twitter -d '
{"index":
  { "number_of_shards": 1,
    "analysis": {
      "filter": {
        "mynGram" : {"type": "nGram", "min_gram": 2, "max_gram": 10}
      },
      "analyzer": {
        "a1" : {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "mynGram"]
        }
      }
    }
  }
}'
curl -XPUT localhost:9200/twitter/tweet/_mapping -d '{
"tweet" : {
"index_analyzer" : "a1",
"search_analyzer" : "standard",
"date_formats" : ["yyyy-MM-dd", "dd-MM-yyyy"],
"properties" : {
"user": {"type":"string", "analyzer":"standard"},
"message" : {"type" : "string" }
}
}}'
curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elastic Search"
}'
curl -XGET 'localhost:9200/twitter/_search?q=ear'
curl -XGET 'localhost:9200/twitter/_search?q=sea'
curl -XGET localhost:9200/twitter/_mapping
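
To see why the two searches above can match, here is a rough Python sketch of what an n-gram filter does at index time. It is not Elasticsearch code: the function names are made up for illustration, and the standard tokenizer is approximated by splitting on whitespace. The idea is that the indexed field ends up holding every substring of each word (length 2 to 10 here), while the search side keeps the query as whole words, so a query like ear only has to equal one of the stored grams.

```python
def ngrams(token, min_gram=2, max_gram=10):
    """Return every substring of token whose length is in [min_gram, max_gram]."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(token):
                break  # longer grams from this start would run past the token
            grams.append(token[start:end])
    return grams


def index_terms(text, min_gram=2, max_gram=10):
    """Approximate the a1 analyzer: split on whitespace (assumption), lowercase, n-gram."""
    terms = set()
    for token in text.lower().split():
        terms.update(ngrams(token, min_gram, max_gram))
    return terms


def search_terms(query):
    """Approximate a search analyzer without n-grams: lowercase whole words only."""
    return set(query.lower().split())


indexed = index_terms("trying out Elastic Search")
for query in ("ear", "sea", "xyz"):
    matched = bool(search_terms(query) & indexed)
    print(query, "matches" if matched else "does not match")
```

Under these assumptions, ear and sea are both grams of "search" and are found, while xyz is not. To check what a real index produces, the mapping request above returns the analyzers that were actually applied.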