Ignore spaces in Elasticsearch
Question
For my search, I want to account for the fact that the space character is not mandatory in a filter request.
For example, when I filter on "THE ONE", I see the corresponding document. I want to see it even if I write "THEONE".
This is how my query is built today:
boolQueryBuilder.must(QueryBuilders.boolQuery()
    .should(QueryBuilders.wildcardQuery("description", "*" + searchedWord.toLowerCase() + "*"))
    .should(QueryBuilders.wildcardQuery("id", "*" + searchedWord.toUpperCase() + "*"))
    .should(QueryBuilders.wildcardQuery("label", "*" + searchedWord.toUpperCase() + "*"))
    .minimumShouldMatch("1"));
What I want is to add this filter (from "Writing a space-ignoring autocompleter with ElasticSearch"):
"word_joiner": {
"type": "word_delimiter",
"catenate_all": true
}
But I don't know how to do this using the API. Any ideas? Thanks!
EDIT: Following @raam86's suggestion, I added my own custom analyzer:
{
"index": {
"number_of_shards": 1,
"analysis": {
"filter": {
"word_joiner": {
"type": "word_delimiter",
"catenate_all": true
}
},
"analyzer": {
"custom_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"word_joiner"
]
}
}
}
}
}
This is the document:
@Document(indexName = "cake", type = "pa")
@Setting(settingPath = "/elasticsearch/config/settings.json")
public class PaElasticEntity implements Serializable {
@Field(type = FieldType.String, analyzer = "custom_analyzer")
private String maker;
}
Still not working...
Recommended answer
You need to:
1. Create the index with these settings
PUT joinword
{
"settings": {
"analysis": {
"filter": {
"word_joiner": {
"type": "shingle",
"output_unigrams": "true",
"token_separator": ""
}
},
"analyzer": {
"word_join_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"word_joiner"
]
}
}
}
}
}
2. Check that the analyzer works as expected
GET joinword/_analyze?pretty
{
"analyzer": "word_join_analyzer",
"text": "ONE TWO"
}
Output:
{
"tokens" : [ {
"token" : "one",
"start_offset" : 0,
"end_offset" : 3,
"type" : "<ALPHANUM>",
"position" : 0
}, {
"token" : "onetwo",
"start_offset" : 0,
"end_offset" : 7,
"type" : "shingle",
"position" : 0
}, {
"token" : "two",
"start_offset" : 4,
"end_offset" : 7,
"type" : "<ALPHANUM>",
"position" : 1
} ]
}
So now you can find this document by searching for one, two, or onetwo. The search is case-insensitive.
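To make the token output above easier to reason about, here is a minimal plain-Java sketch that imitates what the analyzer does: split on whitespace, lowercase, then emit each word plus the word joined with its neighbour using an empty separator. It is only a simulation for illustration, not Elasticsearch's actual implementation; the names ShingleSketch, analyze and matches are hypothetical, and the whitespace split is a simplifying stand-in for the standard tokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ShingleSketch {

    // Rough imitation of: tokenizer -> lowercase -> shingle with token_separator "".
    // Assumption: splitting on whitespace is close enough to the standard tokenizer here.
    static List<String> analyze(String text, int maxShingleSize) {
        List<String> tokens = new ArrayList<>();
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return tokens;
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start++) {
            StringBuilder joined = new StringBuilder();
            // size 1 is the unigram (output_unigrams), larger sizes are the joined shingles
            for (int size = 1; size <= maxShingleSize && start + size <= words.length; size++) {
                joined.append(words[start + size - 1]);
                tokens.add(joined.toString());
            }
        }
        return tokens;
    }

    // A match query hits when at least one analyzed query token equals an indexed token.
    static boolean matches(String indexedText, String queryText) {
        List<String> indexed = analyze(indexedText, 2);
        for (String queryToken : analyze(queryText, 2)) {
            if (indexed.contains(queryToken)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(analyze("ONE TWO", 2));               // [one, onetwo, two]
        System.out.println(matches("ONE TWO", "ONEtWO"));        // true
        System.out.println(matches("THE ONE TWO", "theonetwo")); // false
    }
}
```

The last call shows a limitation worth knowing: the settings above do not set max_shingle_size, so the shingle filter's default of 2 applies and only pairs of adjacent words are joined. If three or more words must match when written together, raising max_shingle_size in the filter definition is the likely fix.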
The full project is available on GitHub.
Entity:
@Document(indexName = "document", type = "document", createIndex = false)
@Setting(settingPath = "elasticsearch/document_index_settings.json")
public class DocumentES {
@Id()
private String id;
@Field(type = FieldType.String, analyzer = "word_join_analyzer")
private String title;
public DocumentES() {
}
public DocumentES(java.lang.String title) {
this.title = title;
}
public java.lang.String getId() {
return id;
}
public void setId(java.lang.String id) {
this.id = id;
}
public String getTitle() {
return title;
}
public void setTitle(String title) {
this.title = title;
}
@Override
public java.lang.String toString() {
return "DocumentES{" +
"id='" + id + '\'' +
", title='" + title + '\'' +
'}';
}
}
Main:
@SpringBootApplication
@EnableConfigurationProperties(value = {ElasticsearchProperties.class})
public class Application implements CommandLineRunner {
@Autowired
ElasticsearchTemplate elasticsearchTemplate;
public static void main(String[] args) {
SpringApplication.run(Application.class);
}
@Override
public void run(String... args) throws Exception {
elasticsearchTemplate.createIndex(DocumentES.class);
elasticsearchTemplate.putMapping(DocumentES.class);
elasticsearchTemplate.index(new IndexQueryBuilder()
.withIndexName("document")
.withType("document")
.withObject(new DocumentES("ONE TWO")).build()
);
// Crude wait so the index refresh makes the new document searchable
Thread.sleep(2000);
NativeSearchQuery query = new NativeSearchQueryBuilder()
.withIndices("document")
.withTypes("document")
.withQuery(matchQuery("title", "ONEtWO"))
.build();
List<DocumentES> result = elasticsearchTemplate.queryForList(query, DocumentES.class);
result.forEach(System.out::println);
}
}