Lucene indexing: Store and indexing modes explained

Question

I think I'm still not understanding the Lucene indexing options.

The following options:

  • Store.Yes
  • Store.No

  • Index.Tokenized
  • Index.Un_Tokenized
  • Index.No
  • Index.No_Norms

I don't really understand the Store option. Why would you ever want to NOT store your field?
Tokenizing is splitting up the content and removing the noise words/separators (like "and", "or", etc.).
I don't have a clue what norms could be. How are tokenized values stored?
What happens if I store a value "my string" in "fieldName"? Why doesn't a query

fieldName:my string

return anything?

Recommended answer

Store.Yes

Means that the value of the field will be stored in the index.

Store.No

Means that the value of the field will NOT be stored in the index.

Store.Yes/No does not affect indexing or searching with Lucene. It just tells Lucene whether you want it to act as a datastore for the values in the field. If you use Store.Yes, then when you search, the value of that field will be included in the Documents in your search results.

If you're storing your data in a database and only using the Lucene index for searching, then you can get away with Store.No on all of your fields. However, if you're using the index as storage as well, then you'll want Store.Yes.
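To make the stored/not-stored distinction concrete, here is a minimal sketch in plain Java. It is a toy model and NOT the Lucene API: the class ToyStoreIndex, its methods, and the field names "title" and "body" are all made up for illustration. It only shows the idea that a field can be searchable without its original value being kept for retrieval.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only (hypothetical names); this is not how Lucene is implemented.
class ToyStoreIndex {
    // "field:term" -> ids of documents containing that term (the searchable part)
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // document id -> stored field values (the "datastore" part)
    private final List<Map<String, String>> storedFields = new ArrayList<>();

    // Starts a new empty document and returns its id.
    int newDocument() {
        storedFields.add(new HashMap<>());
        return storedFields.size() - 1;
    }

    // Every field is indexed in this toy; 'store' only decides whether the
    // original value is kept so it can be returned with search hits.
    void addField(int docId, String name, String value, boolean store) {
        for (String token : value.toLowerCase().trim().split("\\s+")) {
            postings.computeIfAbsent(name + ":" + token, k -> new TreeSet<>()).add(docId);
        }
        if (store) {
            storedFields.get(docId).put(name, value);
        }
    }

    // Returns the stored fields of every document matching field:term.
    List<Map<String, String>> search(String field, String term) {
        Set<Integer> ids = postings.getOrDefault(field + ":" + term.toLowerCase(),
                Collections.emptySet());
        List<Map<String, String>> hits = new ArrayList<>();
        for (int docId : ids) {
            hits.add(storedFields.get(docId));
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyStoreIndex index = new ToyStoreIndex();
        int doc = index.newDocument();
        index.addField(doc, "title", "Lucene basics", true);        // like Store.Yes
        index.addField(doc, "body", "kept in a database", false);   // like Store.No
        // The document is found through "body", but only "title" comes back.
        System.out.println(index.search("body", "database"));
    }
}
```

In this sketch the "body" field still matches searches, but the hit only carries "title", so the full text would have to be fetched from wherever it really lives (for example, the database).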

Index.Tokenized

Means that the field will be tokenized when it's indexed (you got that one). This is useful for long fields with multiple words.

Index.Un_Tokenized

Means that the field will not be analyzed and will be indexed as a single value. This is useful for keyword/single-word fields and some short multi-word fields.

Index.No

Exactly what it says. The field will not be indexed and is therefore unsearchable. However, you can use Index.No along with Store.Yes to store a value that you don't want to be searchable.

Index.No_Norms

Same as Index.Un_Tokenized, except that a few bytes will be saved by not storing some normalization data. This data is what is used for boosting and field-length normalization.
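As a rough illustration of the index modes above, the following plain-Java sketch shows which terms would end up searchable for a value under each mode. It is a toy and not Lucene code: the class ToyIndexModes, the Mode enum, and the whitespace-and-lowercase "analyzer" are assumptions made for this example, and norms are not modeled at all.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only (hypothetical names); real analyzers do much more than this.
class ToyIndexModes {
    enum Mode { TOKENIZED, UN_TOKENIZED, NO }

    // Returns the terms that would be searchable for 'value' under 'mode'.
    static List<String> termsFor(String value, Mode mode) {
        switch (mode) {
            case TOKENIZED:
                // Crude stand-in for an analyzer: lowercase, split on whitespace.
                return Arrays.asList(value.toLowerCase().trim().split("\\s+"));
            case UN_TOKENIZED:
                // The whole value is indexed as one single term.
                return Collections.singletonList(value);
            default:
                // Not indexed at all, so nothing is searchable.
                return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + termsFor("my string", mode));
        }
    }
}
```

Under the un-tokenized mode the only term is the full value "my string", so a search for the single word "my" would not match it in this model.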

For further reading, the Lucene javadocs are priceless (current API version 4.4.0):

  • Field.Index: http://lucene.apache.org/core/4_4_0/core/org/apache/lucene/document/Field.Index.html
  • Field.Store: http://lucene.apache.org/core/4_4_0/core/org/apache/lucene/document/Field.Store.html

For your last question, about why your query's not returning anything: without knowing any more about how you're indexing that field, I'd say it's because your fieldName qualifier is attached only to the term "my", so the word "string" is searched against the default field instead. To search for the phrase "my string" you want:

fieldName:"my string"

Or, to search for both the words "my" and "string" in the fieldName field:

fieldName:(my string)
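To show why the field qualifier matters, here is a small plain-Java sketch that splits a query string into per-field clauses. It is a toy parser written for this answer, not Lucene's QueryParser: the class ToyQuerySplitter, the method parse, and the default field name "contents" are assumptions, and it only understands bare words, quoted phrases, and parenthesized groups.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy parser only (hypothetical names); it mimics just the field-scoping rule.
class ToyQuerySplitter {
    // Optional "field:" prefix, then a "quoted phrase", a (group), or a bare word.
    private static final Pattern CLAUSE =
            Pattern.compile("(?:(\\w+):)?(?:\"([^\"]*)\"|\\(([^)]*)\\)|(\\S+))");

    // Returns one "field:term" string per clause found in the query.
    static List<String> parse(String query, String defaultField) {
        List<String> clauses = new ArrayList<>();
        Matcher m = CLAUSE.matcher(query);
        while (m.find()) {
            // A clause without its own qualifier falls back to the default field.
            String field = m.group(1) != null ? m.group(1) : defaultField;
            if (m.group(2) != null) {
                // Quoted phrase: kept together as one clause.
                clauses.add(field + ":\"" + m.group(2) + "\"");
            } else if (m.group(3) != null) {
                // Parenthesized group: the qualifier applies to every word inside.
                for (String word : m.group(3).trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        clauses.add(field + ":" + word);
                    }
                }
            } else {
                // Bare word: the qualifier applies to this word only.
                clauses.add(field + ":" + m.group(4));
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        System.out.println(parse("fieldName:my string", "contents"));
        System.out.println(parse("fieldName:\"my string\"", "contents"));
        System.out.println(parse("fieldName:(my string)", "contents"));
    }
}
```

In this model the first query ends up with "string" scoped to the default field, while the quoted and parenthesized forms keep both words under fieldName.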
