Best way to retrieve a certain field of all documents returned by a Lucene search

Question

I was wondering what the best way is to retrieve a certain field of all documents returned by a Lucene Searcher.

Background: each document has a date field (written on) and I would like to show a timeline of all found documents, so I need to extract the date (day) field of all the documents I find with the search.

I currently retrieve every document using Searcher.doc(int, FieldSelector), with the selector loading only that one field.

I have indexed 250k documents; the search itself takes no time and returns about 10k document ids.

Retrieving those, however, takes 20+ seconds.

What can I do to speed things up, but still get all the values I need?

Solution

A better way to retrieve field values is with FieldCache. For example, if the field value is a string, you can retrieve the values as follows.

String[] fieldValues = FieldCache.DEFAULT.getStrings(indexReader, "FieldName");

As the name suggests, these values are cached. That is, subsequent calls take no time. You can now index into this array with a Lucene document id to retrieve the value of that field for the given document.
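
To illustrate the lookup step for the timeline use case, here is a minimal sketch in plain Java. It assumes the cached array is indexed by the same document ids the search returns (that is, the same IndexReader backs the searcher), and that the date field is stored as a sortable day string such as yyyyMMdd. The class name TimelineSketch, the method countHitsPerDay, and the sample data are hypothetical, and a hard-coded array stands in for the FieldCache result so the example runs without Lucene on the classpath.

```java
import java.util.Map;
import java.util.TreeMap;

public class TimelineSketch {

    /**
     * Counts search hits per day.
     * fieldValues is indexed by document id, as the FieldCache array would be;
     * hitDocIds holds the document ids returned by the search.
     */
    public static Map<String, Integer> countHitsPerDay(String[] fieldValues, int[] hitDocIds) {
        // TreeMap keeps the days in sorted order (chronological for yyyyMMdd strings).
        Map<String, Integer> perDay = new TreeMap<>();
        for (int docId : hitDocIds) {
            String day = fieldValues[docId]; // plain array access, no document loading
            if (day == null) {
                continue; // this document has no value for the field
            }
            perDay.merge(day, 1, Integer::sum);
        }
        return perDay;
    }

    public static void main(String[] args) {
        // Stand-in (assumption) for:
        // FieldCache.DEFAULT.getStrings(indexReader, "FieldName")
        String[] fieldValues = {"20100101", "20100102", "20100101", null, "20100103"};
        // Stand-in (assumption) for the document ids collected from the search hits.
        int[] hitDocIds = {0, 2, 3, 4};
        System.out.println(countHitsPerDay(fieldValues, hitDocIds));
    }
}
```

Each hit costs one array access instead of a stored-document load, which is why this approach avoids the per-document retrieval time described in the question.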
