How to create a Solr schema for hierarchical facets by splitting data into multiple fields at index time

Question

I want to implement Solr hierarchical faceting for my application, where there is a 2-level hierarchy between Category and SubCategory. I want to use the pivot facet solution described at http://wiki.apache.org/solr/HierarchicalFaceting#Pivot_Facets.

The flattened data looks like this:

Doc#1: NonFic > Law
Doc#2: NonFic > Sci
Doc#3: NonFic > Sci > Phys

This data should be split into a separate field for each level of the hierarchy at index time, as shown below.

Indexed terms

Doc#1: category_level0: NonFic; category_level1: Law
Doc#2: category_level0: NonFic; category_level1: Sci
Doc#3: category_level0: NonFic; category_level1: Sci; category_level2: Phys

Can anyone suggest ways to implement this? How do I define the Solr schema to achieve it? I could not find any reference for splitting data at index time as described above.

Thanks

Priyanka

Recommended answer

Do you need to display those individual fields as part of the returned documents? If so, you need the split values in the 'stored' version of each field. If you only need them for searching or faceting, you can ignore the 'stored' form and concentrate on the 'indexed' form.

In either case, if you need to split one field into several, you can do that with copyField or with an UpdateRequestProcessor.

With copyField, the 'stored' form will be the same for all fields, because the original value is copied as-is. However, each destination field can have its own analysis chain, so each one can pick a different part of the hierarchy for its 'indexed' form.
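To make the copyField idea concrete, here is a minimal Python sketch of the effect being described. It is not Solr configuration and does not call any Solr API. It only simulates how every copy keeps the same stored string while a per-field regular expression extracts a different level for indexing. The helper names `level_pattern` and `indexed_terms`, and the assumption that levels are separated by `>`, are illustrative.

```python
import re


def level_pattern(level):
    """Build a regex that skips `level` path segments and captures the next one.

    Hypothetical helper: it stands in for whatever per-field pattern you would
    configure in each destination field's analysis chain.
    """
    return re.compile(r"^(?:[^>]*>){%d}\s*([^>]*?)\s*(?:>|$)" % level)


def indexed_terms(stored_value, max_levels=3):
    """Return the term each category_levelN field would index.

    The stored value is identical for every copy; only the extracted
    (indexed) term differs per field.
    """
    terms = {}
    for level in range(max_levels):
        match = level_pattern(level).match(stored_value)
        # A missing level produces no match, so that field stays empty.
        if match and match.group(1):
            terms["category_level%d" % level] = match.group(1)
    return terms


if __name__ == "__main__":
    for path in ("NonFic > Law", "NonFic > Sci", "NonFic > Sci > Phys"):
        print(path, "->", indexed_terms(path))
```

In an actual schema the extraction would be done by the field type of each destination field, so the exact pattern syntax may need adapting.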

With an UpdateRequestProcessor, you can write a custom processor that takes one field and emits several fields, each containing only its part of the path. Alternatively, instead of writing a custom processor, you can copy the field a few times and apply a different regex processor to each copy.
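The splitting logic such a processor needs is small. The sketch below shows it in plain Python, applied to a document before it is sent to Solr. A real UpdateRequestProcessor would run inside Solr instead, so treat this only as an illustration of the transformation. The source field name `category`, the `category_level` prefix, and the `>` separator are assumptions taken from the example data in the question.

```python
def split_category(doc, source="category", prefix="category_level", sep=">"):
    """Return a copy of `doc` with one extra field per hierarchy level.

    Hypothetical helper: `source` is the assumed name of the field holding
    the flattened path, e.g. "NonFic > Sci > Phys".
    """
    out = dict(doc)  # leave the caller's document untouched
    parts = [part.strip() for part in doc.get(source, "").split(sep)]
    # Skip empty segments so a trailing separator does not create a blank level.
    for level, part in enumerate(p for p in parts if p):
        out["%s%d" % (prefix, level)] = part
    return out


if __name__ == "__main__":
    docs = [
        {"id": "1", "category": "NonFic > Law"},
        {"id": "2", "category": "NonFic > Sci"},
        {"id": "3", "category": "NonFic > Sci > Phys"},
    ]
    for doc in docs:
        print(split_category(doc))
```

Because the split values here are real field values rather than analysis output, they are also available in the 'stored' form, which matters if the individual levels must be displayed.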
