Elasticsearch analyzer works when created through Spring Data but fails when created directly from Postman/curl


Problem description


Goal: create an Elasticsearch index to be loaded with 10 million simple documents. Each document is basically an "Elasticsearch id", "some company id" and "name". Provide a search-as-you-type feature.

I can successfully create an index and an analyzer either straight from Postman (or curl, or any other tool not relying on Spring Data) or during Spring Boot initialization. Nevertheless, when I try to use the analyzer, it seems to be ignored for the index created straight from Postman.

So my main question is: is Spring Data adding some setting that I am missing when I post the JSON settings directly? A secondary question is: is there some way to make Spring Data print the commands it auto-generates and executes (similar to Hibernate, which lets you see the commands printed)? If so, I can visually debug and check what is different.

This is how the index and analyzer are created from Spring Boot/Spring Data.

Main method to boot

@EnableElasticsearchRepositories
@SpringBootApplication
public class SearchApplication {

    public static void main(String[] args) {
        SpringApplication.run(SearchApplication.class, args);
    }

}

my model

@Document(indexName = "correntistas")
@Setting(settingPath = "data/es-config/elastic-setting.json")
@Getter
@Setter
public class Correntista {
    @Id
    private String id;
    private String conta;
    private String sobrenome;

    @Field(type = FieldType.Text, analyzer = "autocomplete_index", searchAnalyzer = "autocomplete_search")
    private String nome;
}

src/main/resources/data/es-config/elastic-setting.json (note: these are exactly the same settings I am posting from Postman)

{
  "analysis": {
    "filter": {
      "autocomplete_filter": {
        "type": "edge_ngram",
        "min_gram": 1,
        "max_gram": 20
      }
    },
    "analyzer": {
      "autocomplete_search": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
          "lowercase"
        ]
      },
      "autocomplete_index": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
          "lowercase",
          "autocomplete_filter"
        ]
      }
    }
  }
}

Checking whether it was created successfully, I see:

GET http://localhost:9200/correntistas/_settings

{
    "correntistas": {
        "settings": {
            "index": {
                "number_of_shards": "5",
                "provided_name": "correntistas",
                "creation_date": "1586615323459",
                "analysis": {
                    "filter": {
                        "autocomplete_filter": {
                            "type": "edge_ngram",
                            "min_gram": "1",
                            "max_gram": "20"
                        }
                    },
                    "analyzer": {
                        "autocomplete_index": {
                            "filter": [
                                "lowercase",
                                "autocomplete_filter"
                            ],
                            "type": "custom",
                            "tokenizer": "standard"
                        },
                        "autocomplete_search": {
                            "filter": [
                                "lowercase"
                            ],
                            "type": "custom",
                            "tokenizer": "standard"
                        }
                    }
                },
                "number_of_replicas": "1",
                "uuid": "xtN-NOX3RQWJjeRdyC8CVA",
                "version": {
                    "created": "6080499"
                }
            }
        }
    }
}

So far so good.

Now I delete the index with curl -XDELETE localhost:9200/correntistas and do the same thing again, this time creating the index and analyzer in one step from Postman:

PUT http://localhost:9200/correntistas with exactly the same analyzer settings posted above.

Then, if I check the settings, I see exactly the same result as when the index was created from Spring Data.

Am I missing some extra step that Spring Data performs for free behind the scenes?

To sum up, when the index is created from Spring Data, searching with just a few letters works, but when I create it from Postman, it only retrieves data when I search with the whole word.

*** Thanks to the friendly and smart help from Opster Elasticsearch Ninja, I can add here an extra trick I learned when posting from Postman: somehow, some header enabled in my Postman was causing a failure with "... Root mapping definition has unsupported parameters... mapper_parsing_exception..." while I was trying the solution answered below. I guess it may be useful for future readers.

Solution

You have not provided the search query you are using in Postman, nor the mapping. Both would help us check whether the fields in your search query are using the right analyzer. Adding sample documents and your actual and expected search results always helps too.

Never mind, I added your mapping, and I show below that you get the correct results when using Postman as well.

Index definition, exactly the same as yours

{
    "settings": {
        "analysis": {
            "filter": {
                "autocomplete_filter": {
                    "type": "edge_ngram",
                    "min_gram": 1,
                    "max_gram": 20
                }
            },
            "analyzer": {
                "autocomplete_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase"
                    ]
                },
                "autocomplete_index": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "autocomplete_filter"
                    ]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "analyzer": "autocomplete_index",
                "search_analyzer": "autocomplete_search"
            }
        }
    }
}
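For readers who want to send the same definition without Postman, here is a minimal sketch using the JDK's built-in HTTP client (Java 11+). It is an illustration only, not what Spring Data does internally. The class name CreateIndexSketch and the helper buildRequest are made up for this example, and the host and index name are taken from the question. The body mirrors the answer as written; depending on the Elasticsearch version, the mappings object may need to be nested under a type name.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CreateIndexSketch {

    // Settings AND mappings in one body: the analyzers are only used
    // by a field if the mapping for that field refers to them.
    static final String BODY =
        "{\"settings\":{\"analysis\":{"
        + "\"filter\":{\"autocomplete_filter\":{\"type\":\"edge_ngram\",\"min_gram\":1,\"max_gram\":20}},"
        + "\"analyzer\":{"
        + "\"autocomplete_search\":{\"type\":\"custom\",\"tokenizer\":\"standard\",\"filter\":[\"lowercase\"]},"
        + "\"autocomplete_index\":{\"type\":\"custom\",\"tokenizer\":\"standard\",\"filter\":[\"lowercase\",\"autocomplete_filter\"]}"
        + "}}},"
        + "\"mappings\":{\"properties\":{\"name\":{\"type\":\"text\","
        + "\"analyzer\":\"autocomplete_index\",\"search_analyzer\":\"autocomplete_search\"}}}}";

    // Hypothetical helper: builds the PUT request without sending it.
    static HttpRequest buildRequest(String baseUrl, String index) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/" + index))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(BODY))
            .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumes an Elasticsearch node is listening on localhost:9200.
        HttpRequest request = buildRequest("http://localhost:9200", "correntistas");
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Setting only the Content-Type header keeps the request close to what curl sends, which can help rule out stray client headers when comparing against Postman.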

Index sample docs

{
    "name" : "opster"
}

{
    "name" : "jim c"
}

{
    "name" : "jimc"
}

{
    "name" : "foo"
}

Searching for partial words like ji brings both jim c and jimc docs

{
    "query": {
        "match": {
            "name": {
                "query": "ji"
            }
        }
    }
}

Result

  "hits": [
            {
                "_index": "61158504",
                "_type": "_doc",
                "_id": "2",
                "_score": 0.69263697,
                "_source": {
                    "name": "jimc"
                }
            },
            {
                "_index": "61158504",
                "_type": "_doc",
                "_id": "1",
                "_score": 0.6133945,
                "_source": {
                    "name": "jim c"
                }
            }
        ]
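The result above follows from how the two analyzers tokenize text. The sketch below is a simplified simulation in plain Java, not Elasticsearch code: it approximates the standard tokenizer by splitting on whitespace, and the class and method names (EdgeNgramDemo, indexTokens, searchTokens, matches) are invented for illustration. It only shows why a query for ji can hit documents indexed with autocomplete_index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class EdgeNgramDemo {

    // Rough stand-in for autocomplete_search (standard tokenizer + lowercase):
    // splits on whitespace only, which is enough for the sample names.
    static List<String> searchTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String word : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!word.isEmpty()) {
                tokens.add(word);
            }
        }
        return tokens;
    }

    // Rough stand-in for autocomplete_index: every prefix of each word,
    // from minGram up to maxGram characters (the edge_ngram filter).
    static List<String> indexTokens(String text, int minGram, int maxGram) {
        List<String> tokens = new ArrayList<>();
        for (String word : searchTokens(text)) {
            int limit = Math.min(word.length(), maxGram);
            for (int len = minGram; len <= limit; len++) {
                tokens.add(word.substring(0, len));
            }
        }
        return tokens;
    }

    // A match query (default OR operator) hits when at least one
    // search token equals one of the indexed tokens.
    static boolean matches(String doc, String query) {
        List<String> indexed = indexTokens(doc, 1, 20);
        for (String token : searchTokens(query)) {
            if (indexed.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(indexTokens("jimc", 1, 20));
        for (String doc : new String[] {"opster", "jim c", "jimc", "foo"}) {
            System.out.println(doc + " -> " + matches(doc, "ji"));
        }
    }
}
```

If the name field is instead indexed with the default analyzer, which is what happens when no mapping refers to autocomplete_index, only whole words are stored. That is consistent with the whole-word-only behavior described in the question.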
