Preserve associations across multivalued fields in Solr


Problem description

I have multivalued fields in my Solr data source. Here is a sample document:


<doc>
<str name="id">23606</str>
<arr name="institution">
    <str>Harvard University</str>
    <str>Yale University</str>
    <str>Cornell University</str>
    <str>TUFTS University</str>
    <str>University of Arizona</str>
</arr>
<arr name="degree_level">
    <str>Bachelors</str>
    <str>Diploma</str>
    <str>Master</str>
    <str>Master</str>
    <str>PhD</str>
</arr>
</doc>

In the example above, this user has a Bachelors degree from Harvard, a Diploma from Yale, a Master from Cornell, a Master from TUFTS, and a PhD from Arizona. If I search for users who have a Bachelors degree and graduated from Harvard, I get this user, which is correct:
MyDomain:8888/solr/mycol/select?facet=true&q=*:*&fq=degree_level:Bachelors&fq=institution:Harvard+University

But if I want those who have a Bachelors from Cornell, I get this user as well, which is incorrect:
MyDomain:8888/solr/mycol/select?facet=true&q=*:*&fq=degree_level:Bachelors&fq=institution:Cornell+University

The question is: how can I preserve the ordering/mapping between multivalued fields in Solr?

By the way, I know that I can solve my problem by creating a new field that contains the concatenation of the degree and the university (i.e., "Bachelors_Harvard University", "Diploma_Yale University", and so on), but I need a solution based on Solr core itself, as I have a lot of multivalued fields with a lot of combinations.
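The false match can be reproduced outside Solr with a minimal Python sketch. It assumes each filter is checked against its own field independently, so the position of a value inside a multivalued field plays no role. The `doc` dictionary and the `matches` helper are illustrative only and are not part of any Solr API.

```python
# Model of the sample document: each multivalued field is treated as a bag of
# values, so position i of one field is not tied to position i of another.
doc = {
    "id": "23606",
    "institution": ["Harvard University", "Yale University", "Cornell University",
                    "TUFTS University", "University of Arizona"],
    "degree_level": ["Bachelors", "Diploma", "Master", "Master", "PhD"],
}


def matches(doc, filters):
    """Return True if every (field, value) filter matches some value of that field."""
    return all(value in doc[field] for field, value in filters)


harvard = matches(doc, [("degree_level", "Bachelors"),
                        ("institution", "Harvard University")])
cornell = matches(doc, [("degree_level", "Bachelors"),
                        ("institution", "Cornell University")])

# Both filters succeed, although the Bachelors degree is not from Cornell.
print(harvard, cornell)
```

Both checks succeed because "Bachelors" and "Cornell University" each appear somewhere in their own field, which is the behaviour the question describes.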

Recommended answer

Here is a list of suggestions:

  • Try using dynamic fields:
    <dynamicField name="degree_level_*" type="string" indexed="true" stored="true" />
    Create the fields dynamically at index time, for example degree_level_Bachelors with the value Harvard University, and so on. When you want to filter on a Bachelors degree, filter on the field degree_level_Bachelors. Similarly, if you want to allow filtering on institutions, create a dynamic field for institutions. Note that the sample user holds two Master degrees, so the dynamic field would also need multiValued="true" to store both institutions.
  • Predefine the format in which you store the data: <year><separator><degree><separator><institution><separator><major> and so on, and then filter with the required pattern. For example:
    fq=educationDetails:2009@Bachelors@Harvard@*
    This returns all records with a Bachelors from Harvard in 2009. You will have to come up with a pattern for each of the different filters. The example above is a trailing-wildcard query; Solr also accepts regular expressions in the /.../ syntax.
  • Use two collections to correctly model the one-to-many relationship between user and degree, queried using {!join}.
  • Use one collection at a "user-degree" level of granularity and dedupe the results via Solr's field collapsing support.
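The suggestions above all rely on reshaping the data at index time. The following Python sketch shows one way to do that, assuming the two multivalued fields are parallel lists (position i of each describes the same degree). The function names, the `user_id` field, the `degrees` core name and the `@` separator are hypothetical choices for illustration. The year is omitted from the composite value because the sample document does not contain one.

```python
# Parallel multivalued fields from the sample document.
institutions = ["Harvard University", "Yale University", "Cornell University",
                "TUFTS University", "University of Arizona"]
degree_levels = ["Bachelors", "Diploma", "Master", "Master", "PhD"]


def to_dynamic_fields(user_id, degree_levels, institutions):
    """Dynamic fields: one degree_level_<level> field holding the institutions."""
    doc = {"id": user_id}
    for level, institution in zip(degree_levels, institutions):
        doc.setdefault(f"degree_level_{level}", []).append(institution)
    return doc


def to_composite_values(degree_levels, institutions, sep="@"):
    """Composite values: one '<degree><sep><institution>' string per degree."""
    return [f"{level}{sep}{institution}"
            for level, institution in zip(degree_levels, institutions)]


def to_degree_docs(user_id, degree_levels, institutions):
    """Join / collapse layouts: one document per (user, degree) pair."""
    return [{"id": f"{user_id}_{i}", "user_id": user_id,
             "degree_level": level, "institution": institution}
            for i, (level, institution) in enumerate(zip(degree_levels, institutions))]


user_doc = to_dynamic_fields("23606", degree_levels, institutions)
composite = to_composite_values(degree_levels, institutions)
degree_docs = to_degree_docs("23606", degree_levels, institutions)

# With one document per degree, both conditions must hold on the same record.
cornell_bachelors = any(d["degree_level"] == "Bachelors"
                        and d["institution"] == "Cornell University"
                        for d in degree_docs)
print(cornell_bachelors)

# Filter strings for each layout (field and core names are assumptions).
dynamic_fq = 'degree_level_Bachelors:"Cornell University"'
join_fq = ('{!join from=user_id to=id fromIndex=degrees}'
           '+degree_level:Bachelors +institution:"Cornell University"')
collapse_fqs = ['degree_level:Bachelors',
                'institution:"Cornell University"',
                '{!collapse field=user_id}']
```

In each layout the degree and the institution are stored together, so a filter for a Bachelors from Cornell no longer matches the sample user. Whether a cross-collection join is available depends on the deployment, so treat `join_fq` as a starting point rather than a drop-in query.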
