Lucene Indexing to ignore apostrophes

Views: 85 Published: 2020/5/4 7:45:01 lucene

Question

I have a field that might have apostrophes in it. I want to be able to: 1. store the value as is in the index; 2. search on the value while ignoring any apostrophes.

I am thinking of using:

   doc.add(new Field("name", value, Store.YES, Index.NO));
   doc.add(new Field("name", value.replaceAll("['‘’`]",""), Store.NO, Index.ANALYZED));

If I then apply the same replacement when searching, I guess it should work: the cleaned value is used for indexing and searching, and the original value is used for display.
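The idea in the question (apply one and the same replacement at index time and at query time) can be sketched in plain Java, without any Lucene classes. The class name ApostropheNormalizer and the method normalize are hypothetical names chosen for this illustration; the character class is the one from the question's replaceAll call.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): one shared normalization step
// so that indexing and searching strip exactly the same characters.
public class ApostropheNormalizer {

    // Same character class as in the question: ASCII apostrophe,
    // left/right single quotation marks, and backtick.
    private static final Pattern APOSTROPHES = Pattern.compile("['\u2018\u2019`]");

    public static String normalize(String value) {
        return APOSTROPHES.matcher(value).replaceAll("");
    }

    public static void main(String[] args) {
        String original = "O\u2019Brien";          // kept as is for display
        String indexed = normalize(original);      // what would be indexed
        String query = normalize("O'Brien");       // what the user typed, cleaned
        // Both sides reduce to the same text, so they can match.
        System.out.println(indexed.equals(query));
    }
}
```

Keeping the replacement in a single method avoids the index side and the query side drifting apart, which is the main risk of doing the replacement by hand in two places.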

Am I missing any other considerations here?

Recommended answer

Performing replaceAll directly on the value is bad practice in Lucene; it is much better to encapsulate your tokenization recipe in an Analyzer. Also, I don't see the benefit of adding the field twice in your use case (see Document.add).

If you want to store the original value and still be able to search without the apostrophes, simply declare your field like this:

doc.add(new Field("name", value, Store.YES, Index.ANALYZED));

Then simply hook up a custom Tokenizer that replaces apostrophes (I think Lucene's StandardAnalyzer already includes this transformation).
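As a rough illustration of moving the replacement into the analysis chain, the sketch below wraps a java.io.Reader so that apostrophes are dropped before the text reaches a tokenizer. It uses only the JDK and is not a Lucene class: the name ApostropheStrippingReader and the helper strip are hypothetical, and unlike Lucene's own character filters (for example MappingCharFilter) it does not correct token offsets, which matters if you rely on highlighting. Check the API of the Lucene version you use before wiring anything like this into an Analyzer.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical JDK-only stand-in for a character filter placed in front of a tokenizer.
public class ApostropheStrippingReader extends FilterReader {

    // Characters to drop: ASCII apostrophe, left/right single quotes, backtick.
    private static final String APOSTROPHES = "'\u2018\u2019`";

    public ApostropheStrippingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c;
        do {
            c = super.read();                       // skip over apostrophes
        } while (c != -1 && APOSTROPHES.indexOf(c) >= 0);
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(buf, off, len);
            if (n == -1) {
                return -1;                          // end of stream
            }
            int kept = 0;
            for (int i = 0; i < n; i++) {           // compact the chunk in place
                char ch = buf[off + i];
                if (APOSTROPHES.indexOf(ch) < 0) {
                    buf[off + kept++] = ch;
                }
            }
            if (kept > 0) {
                return kept;
            }
            // The whole chunk was apostrophes: read the next chunk.
        }
    }

    // Convenience helper for trying the reader on a String.
    public static String strip(String text) {
        StringBuilder out = new StringBuilder();
        char[] buf = new char[64];
        try (Reader r = new ApostropheStrippingReader(new StringReader(text))) {
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("O'Brien\u2019s `pub`"));
    }
}
```

Because the same analyzer would normally be used for indexing and for parsing queries, a filter of this kind gets applied on both sides without any manual replaceAll calls, while the stored field keeps the original text.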

If you are storing the field in order to use highlighting, you should also consider using Field.TermVector.WITH_POSITIONS_OFFSETS.
