Lucene.Net: How can I add a date filter to my search results?

Question

I've got my searcher working really well, however it does tend to return results that are obsolete. My site is much like NerdDinner whereby events in the past become irrelevant.

I'm currently indexing like this
note: my example is in VB.NET, but I don't care if examples are given in C#

    Public Function AddIndex(ByVal searchableEvent As [Event]) As Boolean Implements ILuceneService.AddIndex

        Dim writer As New IndexWriter(luceneDirectory, New StandardAnalyzer(), False)

        Dim doc As Document = New Document

        doc.Add(New Field("id", searchableEvent.ID, Field.Store.YES, Field.Index.UN_TOKENIZED))
        doc.Add(New Field("fullText", FullTextBuilder(searchableEvent), Field.Store.YES, Field.Index.TOKENIZED))
        doc.Add(New Field("user", If(searchableEvent.User.UserName = Nothing,
                                     "User" & searchableEvent.User.ID,
                                     searchableEvent.User.UserName),
                                 Field.Store.YES,
                                 Field.Index.TOKENIZED))
        doc.Add(New Field("title", searchableEvent.Title, Field.Store.YES, Field.Index.TOKENIZED))
        doc.Add(New Field("location", searchableEvent.Location.Name, Field.Store.YES, Field.Index.TOKENIZED))
        doc.Add(New Field("date", searchableEvent.EventDate, Field.Store.YES, Field.Index.UN_TOKENIZED))

        writer.AddDocument(doc)

        writer.Optimize()
        writer.Close()
        Return True

    End Function

Notice how I have a "date" index that stores the event date.

Then my search looks like this

''# code omitted
        Dim reader As IndexReader = IndexReader.Open(luceneDirectory)
        Dim searcher As IndexSearcher = New IndexSearcher(reader)
        Dim parser As QueryParser = New QueryParser("fullText", New StandardAnalyzer())
        Dim query As Query = parser.Parse(q.ToLower)

        ''# We're using 10,000 as the maximum number of results to return
        ''# because I have a feeling that we'll never reach that full amount
        ''# anyways.  And if we do, who in their right mind is going to page
        ''# through all of the results?
        Dim topDocs As TopDocs = searcher.Search(query, Nothing, 10000)
        Dim doc As Document = Nothing

        ''# loop through the topDocs and grab the appropriate 10 results based
        ''# on the submitted page number
        While i <= last AndAlso i < topDocs.totalHits
                doc = searcher.Doc(topDocs.scoreDocs(i).doc)
                IDList.Add(doc.[Get]("id"))
                i += 1
        End While
''# code omitted

I did try the following, but it was to no avail (threw a NullReferenceException).

        While i <= last AndAlso i < topDocs.totalHits
            If Date.Parse(doc.[Get]("date")) >= Date.Today Then
                doc = searcher.Doc(topDocs.scoreDocs(i).doc)
                IDList.Add(doc.[Get]("id"))
                i += 1
            End If
        End While

I also found the following documentation, but I can't make heads or tails of it
http://lucene.apache.org/java/1_4_3/api/org/apache/lucene/search/DateFilter.html

Recommended answer

You're linking to the API documentation of Lucene 1.4.3. Lucene.Net is currently at 2.9.2. I think an upgrade is due.

First, you're using Field.Store.YES a lot. Stored fields make your index larger, which may become a performance issue. Your date problem can easily be solved by indexing dates as strings in the format "yyyyMMddHHmmssfff" (that's really high resolution, down to milliseconds). Fixed-width strings like this sort in the same order as the dates they represent, so a plain string range can select them. You may want to reduce the resolution so that fewer distinct terms are created and the index stays smaller.

var dateValue = DateTools.DateToString(searchableEvent.EventDate, DateTools.Resolution.MILLISECOND);
doc.Add(new Field("date", dateValue, Field.Store.YES, Field.Index.NOT_ANALYZED));
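
To see why this string format works for range filtering, here is a minimal sketch that uses only the .NET base class library and does not call Lucene at all. The helper name ToKey is hypothetical; it just formats a DateTime with the same pattern mentioned above. It shows that ordinal string order matches chronological order, and that a coarser resolution collapses several events into one term.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class SortableDates
{
    // Hypothetical helper: formats a date as a fixed-width, zero-padded string.
    public static string ToKey(DateTime value, string format)
    {
        return value.ToString(format, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        var earlier = new DateTime(2010, 9, 30, 18, 0, 0);
        var later = new DateTime(2010, 10, 1, 9, 30, 0);

        string a = ToKey(earlier, "yyyyMMddHHmmssfff"); // "20100930180000000"
        string b = ToKey(later, "yyyyMMddHHmmssfff");   // "20101001093000000"

        // Ordinal string comparison agrees with chronological order.
        Console.WriteLine(string.CompareOrdinal(a, b) < 0); // True

        // Three events on the same day, at different times.
        var sameDay = new[] { later, later.AddHours(3), later.AddHours(8) };

        // Day resolution: one distinct term. Millisecond resolution: three.
        Console.WriteLine(sameDay.Select(d => ToKey(d, "yyyyMMdd")).Distinct().Count());          // 1
        Console.WriteLine(sameDay.Select(d => ToKey(d, "yyyyMMddHHmmssfff")).Distinct().Count()); // 3
    }
}
```

If events only need to be filtered by day, passing DateTools.Resolution.DAY at both index time and query time would have the same effect inside Lucene, as long as both sides use the same resolution.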

Then you apply a filter to your search (the second parameter, where you currently pass in Nothing/null).

var dateValue = DateTools.DateToString(DateTime.Now, DateTools.Resolution.MILLISECOND);
var filter = FieldCacheRangeFilter.NewStringRange("date", 
                 lowerVal: dateValue, includeLower: true, 
                 upperVal: null, includeUpper: false);
var topDocs = searcher.Search(query, filter, 10000);

You can do this using a BooleanQuery combining your normal query with a RangeQuery, but that would also affect scoring (which is calculated on the query, not the filter). You may also want to avoid modifying the query for simplicity, so you know what query is executed.
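
For comparison, the BooleanQuery alternative might look roughly like the sketch below. It assumes Lucene.Net 2.9.x, where TermRangeQuery is the non-deprecated replacement for RangeQuery, and it reuses the "fullText" and "date" field names from the question. The class name, the helper names WithUpcomingOnly and AddEvent, and the sample events are made up for illustration; member names such as scoreDocs differ in later Lucene.Net versions.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

public static class DateRangeClauseSketch
{
    // Hypothetical helper: requires both the user's query and "date >= now".
    public static Query WithUpcomingOnly(Query userQuery, DateTime now)
    {
        string lower = DateTools.DateToString(now, DateTools.Resolution.MILLISECOND);

        // A null upper term leaves the range open-ended.
        var upcoming = new TermRangeQuery("date", lower, null, true, false);

        var combined = new BooleanQuery();
        combined.Add(userQuery, BooleanClause.Occur.MUST);
        combined.Add(upcoming, BooleanClause.Occur.MUST);
        return combined;
    }

    // Hypothetical helper: indexes one event with a sortable date term.
    private static void AddEvent(IndexWriter writer, string id, string text, DateTime date)
    {
        var doc = new Document();
        doc.Add(new Field("id", id, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.Add(new Field("fullText", text, Field.Store.NO, Field.Index.ANALYZED));
        doc.Add(new Field("date",
            DateTools.DateToString(date, DateTools.Resolution.MILLISECOND),
            Field.Store.NO, Field.Index.NOT_ANALYZED));
        writer.AddDocument(doc);
    }

    public static void Main()
    {
        var analyzer = new StandardAnalyzer(Version.LUCENE_29);
        var directory = new RAMDirectory();

        // Index one past and one future event in memory.
        var writer = new IndexWriter(directory, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED);
        AddEvent(writer, "1", "lucene meetup", DateTime.Now.AddDays(-7));
        AddEvent(writer, "2", "lucene meetup", DateTime.Now.AddDays(7));
        writer.Close();

        var parser = new QueryParser(Version.LUCENE_29, "fullText", analyzer);
        Query query = WithUpcomingOnly(parser.Parse("meetup"), DateTime.Now);

        // No filter is passed here; the date restriction lives in the query itself.
        var searcher = new IndexSearcher(directory, true);
        TopDocs topDocs = searcher.Search(query, null, 10);
        foreach (ScoreDoc hit in topDocs.scoreDocs)
        {
            Console.WriteLine(searcher.Doc(hit.doc).Get("id"));
        }
        searcher.Close();
    }
}
```

Under these assumptions only the future event should be listed. The trade-off is the one described above: the range clause becomes part of the query that is scored, whereas the filter version leaves the user's query untouched.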
