How to get all terms for a Lucene field in Lucene 4


Problem description


I'm trying to update my code from Lucene 3.4 to 4.1. I figured out the changes except one. I have code which needs to iterate over all term values for one field. In Lucene 3.1 there was an IndexReader#terms() method providing a TermEnum, which I could iterate over. This seems to have changed for Lucene 4.1 and even after several hours of search in the documentation I am not able to figure out how. Can someone please point me in the right direction?

Thanks.

For everyone who wants the direct answer, this is the relevant part of the migration guide:

How you obtain the enums has changed. The primary entry point is the Fields class. If you know your reader is a single segment reader, do this:

Fields fields = reader.fields();
if (fields != null) {
  ...
}

If the reader might be multi-segment, you must do this:

Fields fields = MultiFields.getFields(reader);
if (fields != null) {
  ...
}

The fields may be null (e.g. if the reader has no fields).

Note that the MultiFields approach entails a performance hit on MultiReaders, as it must merge terms/docs/positions on the fly. It's generally better to get the sequential (per-segment) readers instead and step through them yourself, if you can (this is how Lucene drives searches). The guide points to oal.util.ReaderUtil for this, where "oal" abbreviates org.apache.lucene; in Lucene 4.x, IndexReader.leaves() also exposes the per-segment readers.
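To make the cost of "merging on the fly" concrete, here is a minimal sketch in plain Java with no Lucene dependency. It assumes each segment exposes its terms as an already sorted list and combines them with a priority queue, which is roughly the kind of work a merged view has to do on every step. The class and method names (SegmentTermMerge, mergeTerms) are hypothetical, and plain String ordering stands in for Lucene's byte-wise term order.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public final class SegmentTermMerge {

    // One cursor per segment: the current term plus the rest of that segment's sorted terms.
    private static final class Cursor implements Comparable<Cursor> {
        String current;
        final Iterator<String> rest;

        Cursor(Iterator<String> it) {
            this.rest = it;
            this.current = it.next();
        }

        // Moves to the next term; returns false when the segment is exhausted.
        boolean advance() {
            if (!rest.hasNext()) {
                return false;
            }
            current = rest.next();
            return true;
        }

        @Override
        public int compareTo(Cursor other) {
            return current.compareTo(other.current);
        }
    }

    // Merges per-segment sorted term lists into one sorted, de-duplicated list.
    public static List<String> mergeTerms(List<List<String>> segments) {
        PriorityQueue<Cursor> queue = new PriorityQueue<Cursor>();
        for (List<String> segment : segments) {
            if (!segment.isEmpty()) {
                queue.add(new Cursor(segment.iterator()));
            }
        }
        List<String> merged = new ArrayList<String>();
        while (!queue.isEmpty()) {
            Cursor top = queue.poll();
            // Skip a term that was already emitted (it occurs in more than one segment).
            if (merged.isEmpty() || !merged.get(merged.size() - 1).equals(top.current)) {
                merged.add(top.current);
            }
            // Re-queue the cursor only if its segment still has terms left.
            if (top.advance()) {
                queue.add(top);
            }
        }
        return merged;
    }
}
```

Every emitted term costs a queue operation across all segments, which is the overhead the guide warns about. Walking each segment on its own avoids the queue, but the caller then sees the same term once per segment that contains it.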

If you pass a SegmentReader to MultiFields.getFields it will simply return reader.fields(), so there is no performance hit in that case.

Once you have a non-null Fields you can do this:

Terms terms = fields.terms("field");
if (terms != null) {
  ...
}

The terms may be null (e.g. if the field does not exist).

Once you have a non-null Terms you can get an enum like this:

TermsEnum termsEnum = terms.iterator(null); // in Lucene 4.x, iterator takes a TermsEnum to reuse; pass null to get a new one

The returned TermsEnum will not be null.

You can then call .next() repeatedly on the TermsEnum; each call returns the next term as a BytesRef, and null once the terms are exhausted.
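Putting the quoted steps together, a minimal sketch against the Lucene 4.x API could look like the following. The class and method names (FieldTermLister, listTerms) are hypothetical; the code assumes the field holds plain text terms, so that BytesRef.utf8ToString() gives a readable value, and it uses the 4.x signature Terms.iterator(TermsEnum reuse).

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.Fields;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

public final class FieldTermLister {

    // Hypothetical helper: returns every distinct term indexed for the given field.
    public static List<String> listTerms(IndexReader reader, String field) throws IOException {
        List<String> result = new ArrayList<String>();

        // Merged view over all segments; null if the reader has no fields at all.
        Fields fields = MultiFields.getFields(reader);
        if (fields == null) {
            return result;
        }

        // null if this particular field does not exist in the index.
        Terms terms = fields.terms(field);
        if (terms == null) {
            return result;
        }

        // Passing null asks for a fresh enum instead of reusing an existing one.
        TermsEnum termsEnum = terms.iterator(null);
        BytesRef term;
        while ((term = termsEnum.next()) != null) {
            // The enum may reuse the BytesRef between calls, so convert it right away.
            result.add(term.utf8ToString());
        }
        return result;
    }
}
```

With a reader obtained from, say, DirectoryReader.open(directory), calling FieldTermLister.listTerms(reader, "field") would return the terms in index order. Numeric fields store encoded terms, so they would need decoding rather than utf8ToString().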

Solution

Please follow the Lucene 4 migration guide.
