Lucene.net and partial "starts with" phrase search

Question

I'm looking to build an auto-complete textbox over a large quantity of city names. Search functionality is as follows: I want a "Starts with" search over a multi-word phrase. For example, if user has typed in "chicago he", only locations such as "Chicago Heights" need to be returned.
I'm trying to use Lucene for this. I'm having issues understanding how this needs to be implemented.

I've tried what I think is the approach that should work:

I've indexed locations with KeywordAnalyzer (I've tried both TOKENIZED and UN_TOKENIZED):

doc.Add(new Field("Name", data.ToLower(), Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.NO));

And search for them via the following (I've also tried a variety of other queries/analyzers/etc):

var luceneQuery = new BooleanQuery();
var wildcardQuery = new WildcardQuery(new Term("Name", "chicago hei*"));
luceneQuery.Add(wildcardQuery, BooleanClause.Occur.MUST);

I'm not getting any results. Would appreciate any advice.

Recommended answer

To do that, you need to index your field with the Field.Index.NOT_ANALYZED setting, which is the same as the UN_TOKENIZED you use, so it should work. Here's a working sample I quickly put together to test. I'm using the latest version available on NuGet:

// Third argument (true) creates the index, overwriting any existing one at that path.
IndexWriter iw = new IndexWriter(@"C:\temp\sotests", new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29), true);

// NOT_ANALYZED stores each value as a single term, so the analyzer is not applied to this field.
// The same Document and Field instances are reused; only the value changes per document.
Document doc = new Document();
Field loc = new Field("location", "", Field.Store.YES, Field.Index.NOT_ANALYZED);
doc.Add(loc);

loc.SetValue("chicago heights");
iw.AddDocument(doc);

loc.SetValue("new-york");
iw.AddDocument(doc);

loc.SetValue("chicago low");
iw.AddDocument(doc);

loc.SetValue("montreal");
iw.AddDocument(doc);

loc.SetValue("paris");
iw.AddDocument(doc);

iw.Commit();

// Search through a reader obtained from the writer.
IndexSearcher ins = new IndexSearcher(iw.GetReader());

// Multi-word prefix: of the values indexed above, only "chicago heights" matches.
WildcardQuery query = new WildcardQuery(new Term("location", "chicago he*"));

var hits = ins.Search(query);

for (int i = 0; i < hits.Length(); i++)
    Console.WriteLine(hits.Doc(i).GetField("location").StringValue());

Console.WriteLine("---");

// Shorter prefix: matches both "chicago heights" and "chicago low".
query = new WildcardQuery(new Term("location", "chic*"));
hits = ins.Search(query);

for (int i = 0; i < hits.Length(); i++)
    Console.WriteLine(hits.Doc(i).GetField("location").StringValue());

iw.Close();
Console.ReadLine();
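
The reason NOT_ANALYZED matters here is that a WildcardQuery is matched against individual indexed terms, not against the original field text. The following is a minimal sketch in plain C# that does not use Lucene at all: it only simulates the two cases. The `AnyTermStartsWith` helper and the whitespace split are illustrative assumptions standing in for term-level prefix matching and for an analyzer that emits one term per word.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PrefixMatchSketch
{
    // Hypothetical stand-in for a term-level prefix match:
    // true if any single indexed term starts with the prefix.
    static bool AnyTermStartsWith(IEnumerable<string> terms, string prefix)
    {
        return terms.Any(t => t.StartsWith(prefix, StringComparison.Ordinal));
    }

    static void Main()
    {
        const string value = "chicago heights";
        const string prefix = "chicago he";

        // Not analyzed: the whole value is kept as one term.
        var notAnalyzed = new[] { value };

        // Analyzed (assumption: the analyzer splits on whitespace): one term per word.
        var analyzed = value.Split(' ');

        Console.WriteLine(AnyTermStartsWith(notAnalyzed, prefix)); // True
        Console.WriteLine(AnyTermStartsWith(analyzed, prefix));    // False
    }
}
```

With one term per word, neither "chicago" nor "heights" starts with "chicago he", which is consistent with a multi-word wildcard returning nothing when the field has been split into separate terms.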
