Is it possible to add custom metadata to a Lucene field?

Question

I've come to the point where I need to store some additional data about where a particular field comes from in my Lucene.Net index. Specifically, I want to attach a guid to certain fields of a document when the field is added to the document, and retrieve it again when I get the document from a search result.

Is this possible?

Edit: Okay, let me clarify a bit by giving an example.

Let's say I have an object that I want to allow the user to tag with custom tags like "personal", "favorite", "some-project". I do this by adding multiple "tag" fields to the document, like so:

doc.Add( new Field( "tag", "personal" ) );
doc.Add( new Field( "tag", "favorite" ) );

The problem is that I now need to record some metadata about each individual tag, specifically a guid representing where that tag came from (imagine it as a user id). Each tag could potentially have a different guid, so I can't simply create a "tag-guid" field (unless the order of the values is preserved; see edit 2 below). I don't need this metadata to be indexed (in fact I'd prefer it not to be, to avoid getting hits on metadata); I just need to be able to retrieve it again from the document/field.

doc.GetFields( "tag" )[0].Metadata...

(I'm making up syntax here, but I hope my point is clear now.)

Edit 2: Since this is really a different question, I've posted a new question for this approach.

Okay, let's try another approach... The key problem area is the indeterminacy of the multiple field values under the same field name (e.g. "tag"). If I could introduce or obtain some kind of determinacy here, I might be able to store the metadata in another field.

For example, if I could rely on the order of the values of the field never changing, I could use an index in the set of values to identify exactly which tag I am referring to.

Is there any guarantee that the order I add the values to a field will remain the same when I retrieve the document at a later time?

Recommended answer

Depending on your search requirements for this index, this may be possible. One option is to store all the tags in a single delimited field and the matching GUIDs in a second field, in the same order, as in the two fields below. That way you control the order of the values yourself. It would require updating both fields whenever the tag list changes, of course, but the overhead may be worth it.

doc.Add(new Field("tags", "{personal}|{favorite}")); 
doc.Add(new Field("tagsref", "{1234}|{12345}")); 
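
To read the pairs back, the two delimited strings have to be split and matched up by position. A minimal sketch of that step follows; `DelimitedValues` and its methods are hypothetical helpers (not part of Lucene.Net), and the code assumes the values never contain `{`, `}` or `|`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Lucene.Net: packs and unpacks the
// "{a}|{b}" strings used in the "tags" and "tagsref" fields above.
public static class DelimitedValues
{
    // Wraps each value in braces and joins the results with '|'.
    public static string Pack(IEnumerable<string> values)
    {
        return string.Join("|", values.Select(v => "{" + v + "}").ToArray());
    }

    // Splits a packed string back into its values, preserving order.
    // Assumes the values themselves never contain '{', '}' or '|'.
    public static List<string> Unpack(string packed)
    {
        return packed
            .Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(part => part.Trim().TrimStart('{').TrimEnd('}'))
            .ToList();
    }

    // Pairs the i-th tag with the i-th reference; fails loudly if the
    // two fields have drifted out of sync.
    public static List<KeyValuePair<string, string>> Pair(string tags, string refs)
    {
        List<string> t = Unpack(tags);
        List<string> r = Unpack(refs);
        if (t.Count != r.Count)
            throw new ArgumentException("tags and tagsref are out of sync");
        return t.Zip(r, (tag, id) => new KeyValuePair<string, string>(tag, id)).ToList();
    }
}
```

With the stored values retrieved from a search hit (for instance via `doc.Get("tags")` and `doc.Get("tagsref")`), `DelimitedValues.Pair` would pair "personal" with "1234" and "favorite" with "12345" for the two fields shown above.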

Note: using the {} allows you to qualify your search for uniqueness where similar values exist.

Example: If values were stored as "person|personal|personage" searching for "person" would return a document that has any one of person, personal or personage. By qualifying in curly brackets like so: "{person}|{personal}|{personage}", I can search for "{person}" and be sure it won't return false positives. Of course, this assumes you don't use curly brackets in your values.
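
The effect of the braces can be illustrated without Lucene at all. The sketch below uses a plain substring test as a stand-in for a wildcard-style search; how a real query behaves depends on the analyzer and query type used for the field, which the answer does not specify. `BraceMatchDemo` and `Matches` are hypothetical names.

```csharp
using System;

public static class BraceMatchDemo
{
    // Plain substring test, standing in for a wildcard-style search.
    // This is an assumption for illustration, not Lucene's matching logic.
    public static bool Matches(string fieldValue, string term)
    {
        return fieldValue.IndexOf(term, StringComparison.Ordinal) >= 0;
    }

    public static void Main()
    {
        string bare = "personal|personage";
        string qualified = "{personal}|{personage}";

        // Unqualified: "person" is found inside "personal" (false positive).
        Console.WriteLine(Matches(bare, "person"));          // True
        // Qualified: "{person}" does not occur, so no false positive.
        Console.WriteLine(Matches(qualified, "{person}"));   // False
        // Qualified: an exact tag is still found.
        Console.WriteLine(Matches(qualified, "{personal}")); // True
    }
}
```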
