Solr associations


Question

For the last couple of days we have been considering Solr as our search engine of choice. Most of the features we need work out of the box or can be easily configured. There is, however, one feature we absolutely need that seems to be well hidden (or missing) in Solr.

I'll try to explain with an example. We have lots of documents that are actually businesses:

<document>
  <name>Apache</name>
  <cat>1</cat>
  ...
</document>
<document>
  <name>McDonalds</name>
  <cat>2</cat>
  ...
</document>

In addition, we have another XML file with all the categories and their synonyms:

<cat id="1">
  <name>software</name>
  <synonym>IT</synonym>
</cat>
<cat id="2">
  <name>fast food</name>
  <synonym>restaurant</synonym>
</cat>

We want to associate businesses with categories so that we can search using the name and/or synonyms of a category. But we do not want to merge these files at indexing time, because we need to be able to update the categories (adding or removing synonyms, for example) without reindexing all the businesses.

Is there anything in Solr that handles this kind of association, or do we need to develop something specific ourselves?

All feedback and suggestions are welcome.

Thanks in advance, Tom

Recommended answer

Basically, you have a design decision to make here. The usual approach with Solr indexes is to denormalize them, i.e. explode the category definition into each business document. Since you do not want to do that, I suggest keeping two types of documents: one for businesses and another for categories. You can keep both in the same index, as Solr does not require all documents to have the same fields. The business documents seem straightforward, but you have to make them searchable by both business name and category id. I suggest creating a category document for each synonym, so that you can search by synonym and find the id (and category name).
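As a rough illustration of the two document types, here is a minimal Python sketch that builds them as plain dictionaries, ready to be posted to Solr by whatever client you use. The field names (doc_type, cat_id, cat_name, synonym) and the id scheme are assumptions made for this example, not something Solr prescribes. Indexing the category name itself as one more searchable term is also an assumption added here, so that a search can match either the name or a synonym.

```python
# Hypothetical field names and id scheme; adapt them to your own schema.

def business_doc(name, cat_id):
    """One document per business, carrying only the category id."""
    return {
        "id": f"biz-{name}",
        "doc_type": "business",
        "name": name,
        "cat_id": cat_id,
    }


def category_docs(cat_id, cat_name, synonyms):
    """One document per searchable term: the category name plus each synonym."""
    terms = [cat_name, *synonyms]
    return [
        {
            "id": f"cat-{cat_id}-{i}",
            "doc_type": "category",
            "cat_id": cat_id,
            "cat_name": cat_name,
            "synonym": term,
        }
        for i, term in enumerate(terms)
    ]


# The businesses and categories from the question, as one flat list of documents.
docs = [
    business_doc("Apache", 1),
    business_doc("McDonalds", 2),
    *category_docs(1, "software", ["IT"]),
    *category_docs(2, "fast food", ["restaurant"]),
]
```

With this layout, adding or removing a synonym only touches the small category documents; the business documents stay as they are.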

To search using synonyms, you will need two searches:

  • Search for the category id using a text search on the name.
  • Search for the businesses using the category id.
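The two searches can be sketched in Python as follows. This is only an outline under stated assumptions: `select` is a hypothetical callable that sends the given parameters to Solr's /select handler and returns the list of matching documents, and the field names (doc_type, cat_id, synonym, name) are invented for the example. The q, fq and fl parameters are standard Solr query parameters.

```python
def _quote(term):
    """Escape backslashes and double quotes so the term is a valid phrase query."""
    return term.replace("\\", "\\\\").replace('"', '\\"')


def find_category_ids(term, select):
    """Search 1: text search over category documents, returning their ids."""
    found = select({
        "q": f'synonym:"{_quote(term)}"',
        "fq": "doc_type:category",   # hypothetical type discriminator field
        "fl": "cat_id",
    })
    return sorted({doc["cat_id"] for doc in found})


def find_businesses(cat_ids, select):
    """Search 2: fetch the businesses that belong to any of the given ids."""
    if not cat_ids:
        return []  # nothing matched in search 1, so skip the second request
    id_clause = " OR ".join(str(cat_id) for cat_id in cat_ids)
    return select({
        "q": f"cat_id:({id_clause})",
        "fq": "doc_type:business",
        "fl": "name,cat_id",
    })


def search_by_category_term(term, select):
    """Run both searches in sequence for a category name or synonym."""
    return find_businesses(find_category_ids(term, select), select)
```

Passing `select` in as a parameter keeps the sketch independent of any particular Solr client library and makes it easy to exercise with a stub.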
