Lucene indexing: Store and indexing modes explained


Question

I think I'm still not understanding the lucene indexing options.

The options are:

  • Store.Yes
  • Store.No

  • Index.Tokenized
  • Index.Un_Tokenized
  • Index.No
  • Index.No_Norms

I don't really understand the store option. Why would you ever want to NOT store your field?
As I understand it, tokenizing is splitting up the content and removing the noise words/separators (like "and", "or", etc.).
I don't have a clue what norms could be. How are tokenized values stored?
What happens if I store a value "my string" in "fieldName"? Why doesn't a query

fieldName:my string

return anything?

Answer

Store.Yes

Means that the value of the field will be stored in the index

Store.No

Means that the value of the field will NOT be stored in the index

Store.Yes/No does not affect the indexing or searching with lucene. It just tells lucene if you want it to act as a datastore for the values in the field. If you use Store.Yes, then when you search, the value of that field will be included in your search result Documents.

If you're storing your data in a database and only using the Lucene index for searching, then you can get away with Store.No on all of your fields. However, if you're using the index as storage as well, then you'll want Store.Yes.
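To make the store/index split concrete, here is a toy model in plain Java. It is not Lucene's API; the class ToyIndex and all of its methods are made up for illustration. It only mimics the idea that the searchable inverted index and the stored field values are two separate structures, so a field can be in one, the other, or both.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only (hypothetical names, NOT Lucene's API).
class ToyIndex {
    // "field:term" -> ids of documents containing that term (the searchable part)
    private final Map<String, Set<Integer>> inverted = new HashMap<>();
    // doc id -> (field name -> original value), filled only for stored fields
    private final Map<Integer, Map<String, String>> stored = new HashMap<>();
    private int nextId = 0;

    int newDocument() {
        int id = nextId++;
        stored.put(id, new HashMap<>());
        return id;
    }

    // store: keep the original value for retrieval; index: make it searchable
    void addField(int docId, String field, String value, boolean store, boolean index) {
        if (index) {
            for (String token : value.toLowerCase().split("\\s+")) {
                inverted.computeIfAbsent(field + ":" + token, k -> new TreeSet<>()).add(docId);
            }
        }
        if (store) {
            stored.get(docId).put(field, value);
        }
    }

    Set<Integer> search(String field, String term) {
        return inverted.getOrDefault(field + ":" + term.toLowerCase(), Collections.emptySet());
    }

    // Returns null when the field was not stored for this document.
    String storedValue(int docId, String field) {
        return stored.get(docId).get(field);
    }

    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex();
        int doc = idx.newDocument();
        idx.addField(doc, "body", "my string", false, true); // like Store.No, indexed
        idx.addField(doc, "dbId", "42", true, false);        // like Store.Yes + Index.No

        System.out.println(idx.search("body", "string"));  // [0]  (found by search)
        System.out.println(idx.storedValue(doc, "body"));  // null (value not kept)
        System.out.println(idx.storedValue(doc, "dbId"));  // 42   (value kept)
        System.out.println(idx.search("dbId", "42"));      // []   (not searchable)
    }
}
```

In this sketch the "body" field behaves like the database-backed setup described above: the hit tells you which document matched, and you would fetch the actual content elsewhere using a stored key such as "dbId".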

Index.Tokenized

Means that the field will be tokenized when it's indexed (you got that one). This is useful for long fields with multiple words.

Index.Un_Tokenized

Means that the field will not be analyzed and will be indexed as a single term. This is useful for keyword/single-word and some short multi-word fields.
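As a rough illustration of the difference, the sketch below shows which index terms each mode would produce for the same value. It is plain Java, not Lucene code: the whitespace-and-lowercase split is only an assumed stand-in for a real analyzer, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy illustration (hypothetical names, not Lucene's API).
class TokenizeDemo {
    // Tokenized: split on whitespace and lowercase, standing in for an analyzer.
    static List<String> tokenizedTerms(String value) {
        return Arrays.asList(value.trim().toLowerCase().split("\\s+"));
    }

    // Un-tokenized: the whole value becomes one term, unchanged.
    static List<String> untokenizedTerms(String value) {
        return Collections.singletonList(value);
    }

    public static void main(String[] args) {
        System.out.println(tokenizedTerms("My String"));   // [my, string]
        System.out.println(untokenizedTerms("My String")); // [My String]
    }
}
```

So a search for the single term "string" can match the tokenized field, while the un-tokenized field only matches a query for the exact value "My String".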

Index.No

Exactly what it says. The field will not be indexed and is therefore unsearchable. However, you can use Index.No along with Store.Yes to store a value that you don't want to be searchable.

Index.No_Norms

Same as Index.Un_Tokenized, except that a few bytes are saved by not storing normalization data (norms). This data is what is used for boosting and field-length normalization.
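To give a feel for what field-length normalization does, here is a small sketch. It assumes the 1/sqrt(number of terms) formula that Lucene's classic TF-IDF similarity uses for the length norm; the class and method names are made up, and real Lucene also folds in boosts and compresses the result, which this ignores.

```java
// Sketch only (hypothetical names); assumes the classic 1/sqrt(numTerms) length norm.
class LengthNormDemo {
    // Shorter fields get a larger norm, so a match in them scores higher.
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    public static void main(String[] args) {
        System.out.println(lengthNorm(4));   // 0.5
        System.out.println(lengthNorm(100)); // 0.1
    }
}
```

With norms omitted, every document is treated the same regardless of field length, which is usually fine for short keyword-like fields.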

For further reading, the Lucene javadocs are invaluable (current API version at the time of writing: 4.4.0).

For your last question, about why your query isn't returning anything: without knowing any more about how you're indexing that field, I'd say it's because the fieldName qualifier attaches only to the term 'my', so 'string' is searched in the default field instead. To search for the phrase "my string" you want:

fieldName:"my string"

To search for the words "my" and "string" in the fieldName field (with the query parser's default OR operator this matches documents containing either word; fieldName:(+my +string) requires both):

fieldName:(my string)
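The scoping rule behind these three query forms can be sketched with a tiny parser. This is a deliberately simplified toy in plain Java, not Lucene's QueryParser: it only handles bare terms, quoted phrases and one level of parentheses, and the default field name "contents" is an assumption chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy parser (hypothetical, NOT Lucene's QueryParser): shows which field each clause lands in.
class ToyQueryParser {
    static List<String> parse(String query, String defaultField) {
        List<String> clauses = new ArrayList<>();
        int i = 0;
        while (i < query.length()) {
            if (Character.isWhitespace(query.charAt(i))) { i++; continue; }
            // An optional "field:" prefix applies only to the clause right after it.
            String field = defaultField;
            int j = i;
            while (j < query.length() && Character.isLetterOrDigit(query.charAt(j))) j++;
            if (j > i && j < query.length() && query.charAt(j) == ':') {
                field = query.substring(i, j);
                i = j + 1;
            }
            if (i >= query.length()) break;
            char c = query.charAt(i);
            if (c == '"') {                       // quoted phrase: one clause
                int end = query.indexOf('"', i + 1);
                if (end < 0) end = query.length();
                clauses.add(field + ":\"" + query.substring(i + 1, end) + "\"");
                i = end + 1;
            } else if (c == '(') {                // group: every term gets the field
                int end = query.indexOf(')', i + 1);
                if (end < 0) end = query.length();
                for (String t : query.substring(i + 1, end).trim().split("\\s+")) {
                    if (!t.isEmpty()) clauses.add(field + ":" + t);
                }
                i = end + 1;
            } else {                              // bare term: ends at whitespace
                int end = i;
                while (end < query.length() && !Character.isWhitespace(query.charAt(end))) end++;
                clauses.add(field + ":" + query.substring(i, end));
                i = end;
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        System.out.println(parse("fieldName:my string", "contents"));     // [fieldName:my, contents:string]
        System.out.println(parse("fieldName:\"my string\"", "contents")); // [fieldName:"my string"]
        System.out.println(parse("fieldName:(my string)", "contents"));   // [fieldName:my, fieldName:string]
    }
}
```

The first line of output is the situation from the question: only "my" is looked up in fieldName, and "string" goes to whatever the default field happens to be.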
