Elasticsearch - search_analyzer vs index_analyzer


Question

I was looking at http://euphonious-intuition.com/2012/08/more-complicated-mapping-in-elasticsearch/ which explains Elasticsearch analyzers.

I did not understand the part about having different search and index analyzers. The second example of custom mapping goes like this:
-> the index analyzer is an edgeNgram
-> the search analyzer is:

"full_name":{
    "filter":[
        "standard",
        "lowercase",
        "asciifolding"
    ],
    "type":"custom",
    "tokenizer":"standard"
}

If we wanted the query "Race" to not return results like *ra*pport and *rac*ial due to edgeNgram, why index it with edgeNgram in the first place?

Please explain with an example where different analyzers are useful.

Answer

You usually have a similar analysis chain at both index time and query time. Similar doesn't mean exactly the same, but usually the way you index documents reflects the way you query them.

The ngrams example is a really good fit though, since it's one of the main reasons why you would use different analyzers at index and query time.

For partial matches you index with edge ngrams, so that "elasticsearch" becomes (with min_gram 3 and max_gram 20):

"ela", "elas", "elast", "elasti", "elastic", "elastics", "elasticse", "elasticsea", "elasticsear", "elasticsearc" and "elasticsearch"
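The expansion above can be reproduced with a few lines of Python. This is only a rough sketch of what an edge ngram filter does to a single token; the function name `edge_ngrams` and its parameters are made up for illustration and are not part of Elasticsearch.

```python
def edge_ngrams(token, min_gram=3, max_gram=20):
    """Return the prefixes of `token` whose length lies between min_gram and max_gram."""
    # The longest prefix is capped by both the token length and max_gram.
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


# Hypothetical helper, used here only to mirror the list in the answer.
print(edge_ngrams("elasticsearch"))
```

A token shorter than `min_gram` produces no terms at all in this sketch, which is worth keeping in mind when picking the minimum length.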

Let's now query the created field. If we query for the term "elastic" there's a match and we get back the expected result. Given what we indexed, we have basically turned what we called a partial match above into an exact match. There's no need to apply ngrams to the query too. If we did, we would query for all of the following terms:

"ela", "elas", "elast", "elasti" and "elastic"

That would make the query far more complex and would lead to weird results as well. Let's say you index the term "elapsed" in another document, in the same field. You would have the following ngrams:

"ela", "elap", "elaps", "elapse", "elapsed"

If you search for "elastic" and apply ngrams to the query, the term "ela" would match this second document too, so you would get it back together with the first document, even though none of its terms contains the whole "elastic" term you were looking for.
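To make the difference concrete, here is a toy simulation of the two setups. It is not how Elasticsearch is implemented: the in-memory index, the helper names (`edge_ngrams`, `build_index`, `search`) and the "match any term" behaviour are all assumptions chosen to keep the sketch short.

```python
def edge_ngrams(token, min_gram=3, max_gram=20):
    """Prefixes of `token` between min_gram and max_gram characters long."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def build_index(docs):
    """Map every indexed term (edge ngram) to the ids of the documents containing it."""
    index = {}
    for doc_id, word in docs.items():
        for term in edge_ngrams(word.lower()):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query_terms):
    """Return the ids of documents matching at least one query term."""
    hits = set()
    for term in query_terms:
        hits |= index.get(term, set())
    return hits


# Index time: both documents go through the edge ngram analysis.
index = build_index({1: "elasticsearch", 2: "elapsed"})

# Search analyzer without ngrams: the query stays a single term.
plain_hits = search(index, ["elastic"])

# Search analyzer with edge ngrams: the query expands to ela, elas, elast, ...
ngram_hits = search(index, edge_ngrams("elastic"))

print(sorted(plain_hits))  # [1]
print(sorted(ngram_hits))  # [1, 2]
```

With the plain search analyzer only the "elasticsearch" document comes back, while the ngrammed query also pulls in "elapsed" through the shared term "ela".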

I would suggest you have a look at the analyze API to play around with different analyzers and compare their results.
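As a starting point, the sketch below builds two request bodies that could be sent to the `_analyze` endpoint, one mimicking the index side and one the search side. It assumes a reasonably recent Elasticsearch version where the filter type is spelled `edge_ngram` and inline filter definitions are accepted; older releases used different spellings, so check the documentation for your version. The helper `analyze_body` is hypothetical, and no request is actually sent here.

```python
import json


def analyze_body(text, use_edge_ngrams):
    """Build a request body for the _analyze endpoint (assumed format, see lead-in)."""
    filters = ["lowercase", "asciifolding"]
    if use_edge_ngrams:
        # Inline filter definition; parameter values follow the example in the answer.
        filters.append({"type": "edge_ngram", "min_gram": 3, "max_gram": 20})
    return {"tokenizer": "standard", "filter": filters, "text": text}


# Index side: edge ngrams applied. Search side: same chain without them.
index_side = analyze_body("elasticsearch", use_edge_ngrams=True)
search_side = analyze_body("elastic", use_edge_ngrams=False)

print(json.dumps(index_side, indent=2))
print(json.dumps(search_side, indent=2))
```

Posting each body to `_analyze` on your cluster and comparing the returned tokens shows which terms end up in the index and which terms the query will look for.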
