Elasticsearch wildcard search on not_analyzed field
Question
I have an index with the following settings and mapping:
{
"settings":{
"index":{
"analysis":{
"analyzer":{
"analyzer_keyword":{
"tokenizer":"keyword",
"filter":"lowercase"
}
}
}
}
},
"mappings":{
"product":{
"properties":{
"name":{
"analyzer":"analyzer_keyword",
"type":"string",
"index": "not_analyzed"
}
}
}
}
}
I am struggling to implement a wildcard search on the name field. My example data looks like this:
[
{"name": "SVF-123"},
{"name": "SVF-234"}
]
Query:
curl -XGET http://localhost:9200/my_index/product/_search -d '
{
"query": {
"filtered" : {
"query" : {
"query_string" : {
"query": "*SVF-1*"
}
}
}
}
}'
It returns both SVF-123 and SVF-234. I think it still tokenizes the data. It should return only SVF-123.
Could you please help?
Thanks in advance.
Recommended answer
I started from the setup you can see in my question. Whenever I changed one part of my settings, one thing started to work but another stopped working. Here is the history of my solution:
1.) I indexed my data with the default settings. This means my data was analyzed by default, which caused a problem on my side. For example:
When a user searches for a keyword like SVF-1, the system runs this query:
{
"query": {
"filtered" : {
"query" : {
"query_string" : {
"analyze_wildcard": true,
"query": "*SVF-1*"
}
}
}
}
}
Result:
SVF-123
SVF-234
This is normal, because the name field of my documents is analyzed. The query is split into the tokens SVF and 1, and SVF matches both of my documents even though 1 does not. I abandoned this approach and created a mapping that makes my fields not_analyzed:
{
"mappings":{
"product":{
"properties":{
"name":{
"type":"string",
"index": "not_analyzed"
},
"site":{
"type":"string",
"index": "not_analyzed"
}
}
}
}
}
But my problem continued.
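The tokenization effect described in step 1 can be illustrated outside Elasticsearch. The sketch below is only a rough approximation: `approx_standard_analyze` and `keep_as_single_term` are hypothetical helpers, not Elasticsearch APIs, and the real default analyzer follows Unicode segmentation rules rather than this simple regex. It shows why both sample names end up sharing a term once the field is analyzed.

```python
import re

def approx_standard_analyze(text):
    # Hypothetical stand-in for an analyzed string field:
    # split on anything that is not a letter or digit, then lowercase.
    return [tok.lower() for tok in re.split(r"[^0-9A-Za-z]+", text) if tok]

def keep_as_single_term(text):
    # Hypothetical stand-in for a not_analyzed field:
    # the whole value is stored as one untouched term.
    return [text]

docs = ["SVF-123", "SVF-234"]

analyzed_terms = {name: approx_standard_analyze(name) for name in docs}
exact_terms = {name: keep_as_single_term(name) for name in docs}

# Terms that both documents have in common once the field is analyzed.
shared = set(analyzed_terms["SVF-123"]) & set(analyzed_terms["SVF-234"])

print(analyzed_terms)
print(exact_terms)
print(shared)
```

Under this approximation, any query that reduces to the shared term matches both documents, whereas the not_analyzed variant keeps each name as a single distinct term.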
2.) After a lot of research, I wanted to try another way and decided to use a wildcard query. My query is:
{
"query": {
"wildcard" : {
"name" : {
"value" : "*SVF-1*"
}
}
},
"filter":{
"term": {"site":"pro_en_GB"}
}
}
This query worked, but with one problem. My fields are not_analyzed now and I am running a wildcard query, so matching is case-sensitive. If I search for svf-1, it returns nothing, and users may well type the query in lowercase.
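The case-sensitivity problem can be reproduced without a cluster. This is a minimal sketch that assumes Python's `fnmatch.fnmatchcase` is a close enough stand-in for a wildcard match against whole, untouched terms (it also understands `[...]` classes, which the Elasticsearch wildcard query does not); `wildcard_hits` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Terms as they would be stored for a not_analyzed field: unchanged, case preserved.
indexed_terms = ["SVF-123", "SVF-234"]

def wildcard_hits(pattern, terms):
    # Hypothetical helper: case-sensitive wildcard match against each whole term.
    return [term for term in terms if fnmatchcase(term, pattern)]

upper_hits = wildcard_hits("*SVF-1*", indexed_terms)
lower_hits = wildcard_hits("*svf-1*", indexed_terms)

print(upper_hits)
print(lower_hits)
```

The uppercase pattern finds only SVF-123, while the lowercase pattern finds nothing, which mirrors the behavior described above.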
3.) I changed my document structure to:
{
"mappings":{
"product":{
"properties":{
"name":{
"type":"string",
"index": "not_analyzed"
},
"nameLowerCase":{
"type":"string",
"index": "not_analyzed"
},
"site":{
"type":"string",
"index": "not_analyzed"
}
}
}
}
}
I added one more field alongside name, called nameLowerCase. When I index a document, I set its fields like this:
{
"name": "SVF-123",
"nameLowerCase": "svf-123",
"site": "pro_en_GB"
}
Here, I convert the query keyword to lowercase and run the search against the new nameLowerCase field, while still displaying the name field.
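The client-side part of this step can be sketched as follows. The helper names `with_lowercase_name` and `build_wildcard_query` are hypothetical and only illustrate the idea: the lowercase copy is added before indexing, and the user's keyword is lowercased and wrapped in wildcards before the request body is built. The site filter is left out to keep the sketch short.

```python
import json

def with_lowercase_name(doc):
    # Hypothetical helper: return a copy of the document with a lowercase
    # duplicate of "name", leaving the original value intact for display.
    enriched = dict(doc)
    enriched["nameLowerCase"] = doc["name"].lower()
    return enriched

def build_wildcard_query(keyword):
    # Hypothetical helper: lowercase the keyword and wrap it in wildcards
    # so it can be matched against the nameLowerCase field.
    return {
        "query": {
            "wildcard": {
                "nameLowerCase": {"value": "*" + keyword.lower() + "*"}
            }
        }
    }

doc = with_lowercase_name({"name": "SVF-123", "site": "pro_en_GB"})
body = build_wildcard_query("SVF-1")

print(json.dumps(doc))
print(json.dumps(body))
```

Note that any `*` or `?` typed by the user would still act as a wildcard here; the sketch does not escape them.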
The final version of my query is:
{
"query": {
"wildcard" : {
"nameLowerCase" : {
"value" : "*svf-1*"
}
}
},
"filter":{
"term": {"site":"pro_en_GB"}
}
}
Now it works. This problem can also be solved by using multi_field. My query contains a dash (-), and I ran into some problems with that.
Many thanks to @Alex Brasetvik for his detailed explanation and effort.