Can I protect short words from an n-gram filter in Solr?

Question

I have seen this question about searching for short words in Solr. I am wondering if there is another possible solution to a similar problem. I am using the EdgeNGramFilter with a minGramSize of 3. I want to protect a specific set of shorter words (two-letter acronyms, mainly) from being ignored, but I'd like to keep that minGramSize of 3 for everything else. EdgeNGramFilter doesn't support a protected words list. Is there any filter or setting that makes this possible within a single field type, or will I need to write one?

Or, am I thinking about this the wrong way?

Answer

I thought hard about this one, but the answer to the other question you mention seems to be the only way. This would be a useful feature for the EdgeNGramFilter, though.

For now, you can add a copy field and apply a KeepWordFilterFactory to it, with a word list containing only the acronyms you need. Or, if your list of acronyms is not known a priori, use a LengthFilter on the copy instead.
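
As a rough sketch of that copy-field setup, the schema might look something like the following. The field names (`title`, `title_acronym`), the type names (`text_edge`, `text_acronym`), the `acronyms.txt` file, the tokenizer choice, and `maxGramSize="15"` are all assumptions made for illustration; only the `minGramSize` of 3 comes from the question.

```xml
<!-- Main type: edge n-grams of length 3 and up, so two-letter tokens produce no grams here. -->
<fieldType name="text_edge" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- maxGramSize is an assumed value -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Companion type: keeps only the tokens listed in acronyms.txt (hypothetical file, one word per line). -->
<fieldType name="text_acronym" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeepWordFilterFactory" words="acronyms.txt" ignoreCase="true"/>
    <!-- If the acronyms are not known in advance, one alternative is to swap the
         KeepWordFilterFactory line for a length-based filter that keeps every
         two-letter token:
         <filter class="solr.LengthFilterFactory" min="2" max="2"/> -->
  </analyzer>
</fieldType>

<!-- Hypothetical fields: the source field plus a copy that holds only the short words. -->
<field name="title" type="text_edge" indexed="true" stored="true"/>
<field name="title_acronym" type="text_acronym" indexed="true" stored="false"/>
<copyField source="title" dest="title_acronym"/>
```

With a layout like this, a search would need to cover both fields, for example by listing `title` and `title_acronym` together in the `qf` parameter of an edismax query. This still uses two fields rather than a single field type, which matches the limitation described above.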
