How to get all terms for a Lucene field in Lucene 4
Question
I'm trying to update my code from Lucene 3.4 to 4.1. I figured out all the changes except one. I have code which needs to iterate over all term values for one field. In Lucene 3.1 there was an IndexReader#terms() method providing a TermEnum, which I could iterate over. This seems to have changed in Lucene 4.1, and even after several hours of searching the documentation I am not able to figure out how. Can someone please point me in the right direction?
Thanks.
For everyone who wants the direct answer, this is the relevant part of the migration guide:
How you obtain the enums has changed. The primary entry point is the Fields class. If you know your reader is a single segment reader, do this:
    Fields fields = reader.fields();
    if (fields != null) { ... }
If the reader might be multi-segment, you must do this:
    Fields fields = MultiFields.getFields(reader);
    if (fields != null) { ... }
The fields may be null (e.g. if the reader has no fields).
Note that the MultiFields approach entails a performance hit on MultiReaders, as it must merge terms/docs/positions on the fly. It's generally better to instead get the sequential readers (use oal.util.ReaderUtil) and then step through those readers yourself, if you can (this is how Lucene drives searches).
If you pass a SegmentReader to MultiFields.fields it will simply return reader.fields(), so there is no performance hit in that case.
Once you have a non-null Fields you can do this:
    Terms terms = fields.terms("field");
    if (terms != null) { ... }
The terms may be null (e.g. if the field does not exist).
Once you have a non-null terms you can get an enum like this:
    TermsEnum termsEnum = terms.iterator();
The returned TermsEnum will not be null.
You can then .next() through the TermsEnum.
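Put together, the steps from the migration guide might look like the sketch below. It is not taken from the guide: the class name `FieldTerms` and the method `allTerms` are made up for illustration, and it assumes Lucene 4.x on the classpath, where `Terms.iterator` takes a `TermsEnum` to reuse (passing `null` asks for a fresh one). It also assumes the field holds text terms, so that decoding each `BytesRef` as UTF-8 is meaningful.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.Fields;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper class, not part of Lucene.
public final class FieldTerms {

    private FieldTerms() {
    }

    // Collects every term of one field, in the order the enum returns them.
    public static List<String> allTerms(IndexReader reader, String field) throws IOException {
        List<String> result = new ArrayList<String>();

        // Merged view over all segments; null if the reader has no fields.
        Fields fields = MultiFields.getFields(reader);
        if (fields == null) {
            return result;
        }

        // Null if the field does not exist.
        Terms terms = fields.terms(field);
        if (terms == null) {
            return result;
        }

        // Lucene 4.x signature: iterator(TermsEnum reuse); null means "no reuse".
        TermsEnum termsEnum = terms.iterator(null);
        BytesRef term;
        while ((term = termsEnum.next()) != null) {
            // The enum may reuse the BytesRef, so copy it out as a String right away.
            result.add(term.utf8ToString());
        }
        return result;
    }
}
```

A call such as `FieldTerms.allTerms(reader, "field")` returns an empty list rather than failing when the index or the field is empty, which mirrors the two null checks in the guide.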
Solution
Please follow the Lucene 4 migration guide.
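The guide also suggests stepping through the per-segment readers yourself to avoid the merging cost of `MultiFields`. A minimal sketch of that variant follows, assuming Lucene 4.x, where `IndexReader.leaves()` returns `AtomicReaderContext` objects (these types were renamed in later major versions). The class name `PerSegmentFieldTerms` is hypothetical. Because the same term can occur in several segments, the sketch collects into a sorted set to drop duplicates; note that this orders terms by Java string comparison, which is not necessarily the index's own term order.

```java
import java.io.IOException;
import java.util.SortedSet;
import java.util.TreeSet;

import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper class, not part of Lucene.
public final class PerSegmentFieldTerms {

    private PerSegmentFieldTerms() {
    }

    // Visits each segment separately and gathers the distinct terms of one field.
    public static SortedSet<String> allTerms(IndexReader reader, String field) throws IOException {
        SortedSet<String> result = new TreeSet<String>();

        for (AtomicReaderContext context : reader.leaves()) {
            AtomicReader segmentReader = context.reader();

            // Null if this segment has no terms for the field.
            Terms terms = segmentReader.terms(field);
            if (terms == null) {
                continue;
            }

            TermsEnum termsEnum = terms.iterator(null);
            BytesRef term;
            while ((term = termsEnum.next()) != null) {
                result.add(term.utf8ToString());
            }
        }
        return result;
    }
}
```

Whether this is worth the extra code depends on the index: for a single-segment reader the `MultiFields` route has no merging overhead, as the guide points out.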