Storing words with an apostrophe in a Lucene index

Views: 89 Published: 2020/5/4 7:37:48 Tags: lucene, lucene.net

Problem description

I have a company field in a Lucene index. One of the indexed company names is: Moody's

When a user types any of the following keywords, I want this company to come up in the search results:
1. Moo
2. Mood
3. Moodys
4. Moody's

How should I store this field in the Lucene index, and what type of Lucene query should I use to get this behaviour?

Thanks.

Recommended answer

Based on your clarifications, I want to divide your question into two, and answer each in turn:

1. How do I index words with apostrophes as equivalent to similar words without an apostrophe? e.g. mapping Moodys and Moody's to the same index term.
2. How do I implement auto-complete search in Lucene, i.e. given an index, find documents using word prefixes, e.g. map Moo to Moodys?

1 is relatively easy: use a StandardTokenizer to create a token combining the apostrophe and s with the previous word, then a StandardFilter to remove the apostrophe and s. This will convert Moody's to Moody. A StandardAnalyzer does this and much more (lowercasing and stop word removal), which may be more than you need. Using a stemmer should take both Moodys and Moody to the same token. Try SnowballFilter for this.
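To make the intended effect of that analysis chain concrete, here is a minimal sketch in plain Java. It does not use Lucene's API: the class `PossessiveNormalizer` and its `normalize` method are hypothetical names, and the "drop a trailing s" rule is only a crude stand-in for a real stemmer. The point is just that Moody's, Moodys and Moody should all end up as the same term at both index time and query time.

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene: illustrates the normalization
// an analyzer chain (possessive removal + lowercasing + stemming) aims for.
final class PossessiveNormalizer {
    private PossessiveNormalizer() {}

    static String normalize(String token) {
        // Lowercase and treat the typographic apostrophe like the ASCII one.
        String t = token.toLowerCase(Locale.ROOT).replace('\u2019', '\'');
        if (t.endsWith("'s")) {
            // Possessive: "moody's" -> "moody"
            t = t.substring(0, t.length() - 2);
        } else if (t.length() > 3 && t.endsWith("s") && !t.endsWith("ss")) {
            // Crude stand-in for a stemmer: "moodys" -> "moody"
            t = t.substring(0, t.length() - 1);
        }
        return t;
    }

    public static void main(String[] args) {
        System.out.println(normalize("Moody's")); // prints moody
        System.out.println(normalize("Moodys"));  // prints moody
        System.out.println(normalize("Moody"));   // prints moody
    }
}
```

Whatever normalization is chosen, the same one has to be applied to the indexed text and to the user's query, otherwise the terms will not line up.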

2 is harder: Lucene's PrefixQuery, to which Alan alluded, will only work when the company name is the first word in a field. You need something like the answer to this question about auto-complete in Lucene.
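One common way to get prefix matching on any word in a field is to index every leading prefix of each word (often called edge n-grams). Whether the linked answer takes this route is an assumption here. The sketch below is plain Java with a hypothetical `PrefixIndex` class, not Lucene code. It simply drops apostrophes so that Moody's and Moodys share the term moodys, then stores all prefixes of each word.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory illustration of prefix (edge n-gram) indexing.
final class PrefixIndex {
    // Maps each word prefix to the company names containing such a word.
    private final Map<String, Set<String>> postings = new HashMap<>();

    void add(String companyName) {
        for (String word : companyName.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            String term = word.replace("'", ""); // "moody's" -> "moodys"
            for (int end = 1; end <= term.length(); end++) {
                postings.computeIfAbsent(term.substring(0, end), k -> new TreeSet<>())
                        .add(companyName);
            }
        }
    }

    Set<String> search(String query) {
        // Apply the same normalization to the query as to the indexed words.
        String key = query.trim().toLowerCase(Locale.ROOT).replace("'", "");
        return postings.getOrDefault(key, Collections.emptySet());
    }

    public static void main(String[] args) {
        PrefixIndex index = new PrefixIndex();
        index.add("Moody's Corporation");
        for (String q : new String[] {"Moo", "Mood", "Moodys", "Moody's", "Corp"}) {
            System.out.println(q + " -> " + index.search(q));
        }
    }
}
```

Because prefixes are stored per word, the match no longer depends on the company name being the first word in the field; the trade-off is a larger index.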
