ES analyzer that also tokenizes numbers and digits

Question

I am using Elasticsearch's built-in simple analyzer (https://www.elastic.co/guide/en/elasticsearch/reference/1.7/analysis-simple-analyzer.html), which uses the lowercase tokenizer. The text "apple 8 IS Awesome" is tokenized as follows:

 "apple",
 "is",
 "awesome"

As you can see, the number 8 is not emitted as a token, so if I search for just 8, my message does not appear in the results.
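To illustrate why the 8 disappears, here is a rough local approximation of a letter-based lowercase tokenizer in Python. It is only a sketch: the function name `simple_like_tokens` is made up for this example, and the regular expression is an approximation of "split on anything that is not a letter", not Elasticsearch's actual implementation.

```python
import re

# Approximation (assumption): treat a token as a maximal run of letters.
# [^\W\d_] means "a word character that is neither a digit nor an underscore".
LETTER_RUN = re.compile(r"[^\W\d_]+")


def simple_like_tokens(text: str) -> list:
    """Lowercase the text and keep only runs of letters, dropping digits."""
    return LETTER_RUN.findall(text.lower())


tokens = simple_like_tokens("apple 8 IS Awesome")
print(tokens)
```

Because the digit is not a letter, it acts as a separator and is never emitted as a token, which matches the output shown above.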

I went through all the analyzers that ship with ES but couldn't find one that meets my requirement.

How can I tokenize all the words, including numbers, using a custom or built-in ES analyzer?

Recommended answer

Your question is about the simple analyzer, but you link to a very old version of the documentation. Try the current one: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-simple-analyzer.html

As Val told you, you are probably looking for the standard analyzer. If you want to see the difference, try the analyze API, which returns the tokens a given analyzer produces for a piece of text.
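The original example for this answer is not preserved here, so the following is only a minimal sketch of how such a comparison could be made from Python using the standard library. It assumes an unsecured Elasticsearch node at http://localhost:9200 (adjust `ES_URL` for your setup); the helper names `build_analyze_body`, `extract_tokens`, and `analyze` are invented for this example. The request itself is a POST to the `_analyze` endpoint with an `analyzer` and a `text` field.

```python
import json
import urllib.request

# Assumption: a local node without authentication or TLS.
ES_URL = "http://localhost:9200"


def build_analyze_body(analyzer: str, text: str) -> dict:
    """Build the JSON body for a POST to the _analyze endpoint."""
    return {"analyzer": analyzer, "text": text}


def extract_tokens(response: dict) -> list:
    """Pull the token strings out of an _analyze response."""
    return [item["token"] for item in response.get("tokens", [])]


def analyze(analyzer: str, text: str, base_url: str = ES_URL) -> list:
    """Send the text to the node and return the tokens the analyzer emits."""
    payload = json.dumps(build_analyze_body(analyzer, text)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/_analyze",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return extract_tokens(json.load(response))
```

With a node running, calling analyze("simple", "apple 8 IS Awesome") and analyze("standard", "apple 8 IS Awesome") side by side should show that only the standard analyzer keeps 8 as a token.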
