Query in Elasticsearch for retrieving strings that start with a particular word
I want to write a query in Elasticsearch that returns only results where the string starts with a particular word. For example, I have one string "Donald Duck" and another string "Alan Donald". If I search for "Donald" with the query below:
"query": {
  "query_string": {
    "query": "Donald",
    "fields": ["character_name"]
  }
}
then the result should be "Donald Duck" and not "Alan Donald", because "Donald Duck" starts with "Donald". Can anyone tell me how to write such a query? I have searched many posts but have not found a solution.
Edit-1
My mapping is given below
"settings": {
  "index": {
    "analysis": {
      "analyzer": {
        "simple_wildcard": {
          "tokenizer": "whitespace",
          "filter": ["lowercase"]
        }
      }
    }
  }
},
"mappings": {
  "college": {
    "properties": {
      "character_name": { "type": "string", "index": "analyzed", "analyzer": "simple_wildcard" }
    }
  }
}
The Limit Token filter is very helpful in this case. You can analyze the character_name field in two different ways: one for standard search operations and another to find strings that start with a particular word. I created a sample index like this; the only_first sub-field indexes only the first token of the string.
PUT character
{
  "settings": {
    "analysis": {
      "analyzer": {
        "character_analyzer": {
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "one_token_limit"
          ]
        }
      },
      "filter": {
        "one_token_limit": {
          "type": "limit",
          "max_token_count": 1
        }
      }
    }
  },
  "mappings": {
    "mytype": {
      "properties": {
        "character_name": {
          "type": "string",
          "fields": {
            "only_first": {
              "type": "string",
              "analyzer": "character_analyzer"
            }
          }
        }
      }
    }
  }
}
Then query the only_first field like this:
{
  "query": {
    "query_string": {
      "fields": ["character_name.only_first"],
      "query": "Donald"
    }
  }
}
This gives the desired results. I used the whitespace tokenizer, but you can use the standard tokenizer instead if you also want a string such as "donald-donald duck" to match.
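To make it easier to see why the only_first sub-field behaves this way, here is a small Python sketch that imitates the analyzer outside Elasticsearch. It is an illustration only, not Elasticsearch code: the function names are hypothetical, and `str.split()` plus `str.lower()` only approximate the whitespace tokenizer and lowercase filter. It also assumes the query text goes through the same analyzer as the indexed text.

```python
def character_analyzer(text, max_token_count=1):
    """Rough imitation of: whitespace tokenizer -> lowercase -> limit filter."""
    tokens = text.split()                 # whitespace tokenizer (approximation)
    tokens = [t.lower() for t in tokens]  # lowercase filter
    return tokens[:max_token_count]       # limit filter keeps only the first N tokens


def matches_only_first(query, document):
    """True if an analyzed query token equals the single indexed first token."""
    indexed = set(character_analyzer(document))
    return any(tok in indexed for tok in character_analyzer(query))


docs = ["Donald Duck", "Alan Donald"]
hits = [d for d in docs if matches_only_first("Donald", d)]
print(hits)  # ['Donald Duck']
```

With whitespace splitting, `character_analyzer("donald-donald duck")` returns `['donald-donald']`, which does not equal `donald`. That is the case where a tokenizer that also splits on the hyphen would be needed.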
Another way is the span_first query. The problem is that it wraps a term query, which is not analyzed, so 'donald' will match but 'Donald' won't:
{
  "span_first": {
    "match": {
      "span_term": { "character_name": "donald" }
    },
    "end": 1
  }
}
Searching for 'Donald' this way gives zero results because the term is matched case-sensitively, whereas the first approach works with either case.
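The case sensitivity can be sketched in a few lines of Python. This is a simplified model, not how Elasticsearch is implemented: `index_tokens` and `span_first_term` are hypothetical names, and the model assumes the field was indexed with a lowercasing analyzer, as in the mappings above, while the span_term value is compared verbatim.

```python
def index_tokens(text):
    """Imitates index-time analysis: whitespace split, then lowercase."""
    return [t.lower() for t in text.split()]


def span_first_term(indexed_tokens, term, end=1):
    """True if `term` occurs verbatim within the first `end` token positions."""
    return term in indexed_tokens[:end]


docs = {name: index_tokens(name) for name in ["Donald Duck", "Alan Donald"]}

lower_hits = [n for n, toks in docs.items() if span_first_term(toks, "donald")]
upper_hits = [n for n, toks in docs.items() if span_first_term(toks, "Donald")]

print(lower_hits)  # ['Donald Duck']
print(upper_hits)  # []
```

Under these assumptions, lowercasing the user's input on the client before building the span_term query would avoid the mismatch.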
EDIT 1: Prefix Match
You can wrap a prefix query inside span_first like this:
{
  "query": {
    "span_first": {
      "match": {
        "span_multi": {
          "match": {
            "prefix": {
              "character_name": {
                "value": "don"
              }
            }
          }
        }
      },
      "end": 1
    }
  }
}
Do not put "*" in the prefix value; the prefix query already matches every term that begins with that value.
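If the prefix comes from user input, the request body can be built in code. The following is a minimal sketch; `build_starts_with_query` is a hypothetical helper, not part of any Elasticsearch client. It assumes the field's indexed terms are lowercase, so it lowercases the input, and it strips a trailing "*" in case the user typed one.

```python
import json


def build_starts_with_query(field, user_input):
    """Build a span_first + prefix request body (hypothetical helper)."""
    # The prefix value is compared against indexed terms as-is, so normalize it:
    # drop surrounding spaces and any trailing "*", then lowercase.
    prefix = user_input.strip().rstrip("*").lower()
    return {
        "query": {
            "span_first": {
                "match": {
                    "span_multi": {
                        "match": {"prefix": {field: {"value": prefix}}}
                    }
                },
                "end": 1,  # only the first token position may match
            }
        }
    }


body = build_starts_with_query("character_name", "Don*")
print(json.dumps(body, indent=2))
```

The printed JSON has the same shape as the query above and can be sent as the search request body.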
Hope it helps!