Is there an Elasticsearch plugin similar to the Solr analysis tool?


Question


Solr has the built-in "Analysis Screen", which helps to debug the interplay between tokenizers and filters for specific field types.

Is there a plugin for Elasticsearch that does something similar? Specifically, I want to see the input/output of each filter, not only the end result of the analysis chain. I searched Google quite intensively for this, but didn't find anything.

https://www.found.no/play/#analysis contains exactly the feature I want (scroll down to "myAnalyzer"), but unfortunately it's not something I can run on my index. But it shows that such a feature is possible.

Edit: I know there are many plugins that show me the output of a complete chain of filters, for example kopf, as suggested by user @Bass.

This is not what I want! I want to see the output of each filter, not only the end result.

Solution

There is a standalone tool called elyzer, made by the nice folks at OpenSource Connections. It shows you the state of your tokens at each step (char filter, tokenizer, token filter) of the analysis process, and it is very simple to use.

It installs via pip install elyzer, and you can then use it as a command-line tool, e.g.

$ elyzer --es "http://localhost:9200" --index tmdb --analyzer english_bigrams --text "Mary had a little lamb"
TOKENIZER: standard
{1:Mary}    {2:had} {3:a}   {4:little}  {5:lamb}    
TOKEN_FILTER: standard
{1:Mary}    {2:had} {3:a}   {4:little}  {5:lamb}    
TOKEN_FILTER: lowercase
{1:mary}    {2:had} {3:a}   {4:little}  {5:lamb}    
TOKEN_FILTER: porter_stem
{1:mari}    {2:had} {3:a}   {4:littl}   {5:lamb}    
TOKEN_FILTER: bigram_filter
{1:mari had}    {2:had a}   {3:a littl} {4:littl lamb}  
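
If installing a separate tool is not an option, a similar per-step view can probably be had from Elasticsearch itself. The sketch below is not part of the original answer and does not claim to show how elyzer works. It assumes an Elasticsearch version whose _analyze API accepts "explain": true and returns a "detail" object with "charfilters", "tokenizer" and "tokenfilters" entries for a custom analyzer. The helper names (build_explain_request, format_stages, explain_analysis) are made up for illustration, and positions are printed exactly as the server reports them.

```python
import json
from urllib import request


def build_explain_request(analyzer, text):
    """Body for POST /<index>/_analyze asking for per-stage detail."""
    return {"analyzer": analyzer, "text": text, "explain": True}


def format_stages(response):
    """Turn the 'detail' part of an _analyze response into printable lines.

    Assumes the custom-analyzer response shape described above; a built-in
    analyzer may report a single 'analyzer' entry, which this sketch ignores.
    """
    detail = response.get("detail", {})
    lines = []
    for cf in detail.get("charfilters", []):
        lines.append("CHAR_FILTER: " + cf["name"])
        lines.append(" ".join(cf.get("filtered_text", [])))
    stages = []
    if "tokenizer" in detail:
        stages.append(("TOKENIZER", detail["tokenizer"]))
    stages += [("TOKEN_FILTER", tf) for tf in detail.get("tokenfilters", [])]
    for label, stage in stages:
        lines.append(f"{label}: {stage['name']}")
        lines.append(" ".join(
            "{%d:%s}" % (t["position"], t["token"]) for t in stage["tokens"]))
    return lines


def explain_analysis(es_url, index, analyzer, text):
    """Send the request to a live cluster (never called at import time)."""
    body = json.dumps(build_explain_request(analyzer, text)).encode("utf-8")
    req = request.Request(
        f"{es_url}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```

With a cluster running, something like print("\n".join(format_stages(explain_analysis("http://localhost:9200", "tmdb", "english_bigrams", "Mary had a little lamb")))) should give a listing in the same spirit as the elyzer output above, reusing the index and analyzer names from that example.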
