Storing words with apostrophe in Lucene index


Question

I have a company field in a Lucene index. One of the indexed company names is: Moody's

When a user types any of the following keywords, I want this company to come up in the search results: 1. Moo 2. Mood 3. Moodys 4. Moody's

How should I store this field in the Lucene index, and what type of Lucene Query should I use to get this behaviour?

Thanks.

Recommended answer

Based on your clarifications, I want to divide your question into two parts and answer each in turn:

  1. How do I index words with apostrophes as equivalent to similar words without an apostrophe? e.g. mapping Moodys and Moody's to the same index term.
  2. How do I implement auto-complete search in Lucene, i.e. given an index, find documents using word prefixes, e.g. map Moo to Moodys?

1 is relatively easy: use a StandardTokenizer to create a token combining the apostrophe and s with the previous word, then a StandardFilter to remove the apostrophe and s. This converts Moody's to Moody. A StandardAnalyzer does this and much more (lowercasing and stop-word removal), which may be more than you need. A stemmer should then take both Moodys and Moody to the same token; try SnowballFilter for this.
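The normalization in point 1 can be sketched in plain Java, independent of any Lucene version. This is an illustration only: the class PossessiveNormalizer and its normalize method are hypothetical names, not Lucene API, and the trailing-s rule is a crude stand-in for a real stemmer such as the one SnowballFilter applies.

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene: shows the intended term mapping only.
final class PossessiveNormalizer {
    private PossessiveNormalizer() {}

    /** Maps a single word to the term that would be indexed and searched. */
    static String normalize(String word) {
        String term = word.toLowerCase(Locale.ROOT);
        // Treat the typographic apostrophe like the ASCII one.
        term = term.replace('\u2019', '\'');
        // Drop a possessive ending: "moody's" becomes "moody".
        if (term.endsWith("'s")) {
            term = term.substring(0, term.length() - 2);
        }
        // Remove any apostrophes left inside the word.
        term = term.replace("'", "");
        // Assumption: a crude plural rule stands in for a real stemmer.
        if (term.length() > 3 && term.endsWith("s") && !term.endsWith("ss")) {
            term = term.substring(0, term.length() - 1);
        }
        return term;
    }
}
```

Applying the same function to both the indexed names and the typed queries is what makes Moodys, Moody's and Moody end up as one term.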

2 is harder: Lucene's PrefixQuery, to which Alan alluded, will only work when the company name is the first word in the field. You need something like the answer to this question about auto-complete in Lucene.
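To show why per-word prefixes help, here is a small in-memory sketch in plain Java. It is not the linked answer's approach and not Lucene API: PrefixIndex, add and search are hypothetical names, and it assumes apostrophes are simply dropped, so Moody's is indexed as moodys. Every prefix of every word is stored, so a name matches even when the typed text is not the first word of the field. In Lucene, a filter such as EdgeNGramTokenFilter is one way to generate prefixes like these at index time.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory index, for illustration only.
final class PrefixIndex {
    private final Map<String, Set<String>> byPrefix = new HashMap<>();

    /** Indexes every prefix of every word in the company name. */
    void add(String companyName) {
        for (String word : companyName.split("\\s+")) {
            String term = normalize(word);
            for (int end = 1; end <= term.length(); end++) {
                byPrefix.computeIfAbsent(term.substring(0, end), k -> new TreeSet<>())
                        .add(companyName);
            }
        }
    }

    /** Returns the names having a word that starts with the typed text. */
    Set<String> search(String typed) {
        return byPrefix.getOrDefault(normalize(typed), Collections.emptySet());
    }

    /** Assumption: lowercase and drop apostrophes, so "Moody's" becomes "moodys". */
    static String normalize(String word) {
        return word.toLowerCase(Locale.ROOT)
                .replace("\u2019", "")
                .replace("'", "");
    }
}
```

With "Moody's" and "Acme Moody's Holdings" added, searching for Moo, Mood, Moodys or Moody's returns both names. The cost is a larger index, since each word contributes one entry per prefix.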
