Defining nested entities in Solr Data Import Handler

Question

Let me preface by mentioning that I've been through everything I could find about this topic including the Solr docs and all of the SO questions.

I have a Solr instance that I've set up with a Data Import Handler to pull in data from MSSQL using the JDBC driver. The data comes in, but it isn't structured as I'd expect based on the Solr DIH documentation:

<document>
 <entity>
  <entity />
 </entity>
</document>

I've tried all the attributes like rootEntity, flatten, using CachedSqlProvider, etc. With multiValued="True", the result ends up as

docs [
{
  recordId: '1234',
  name: 'whatever',
  subrows_col1: ['x','y','z'],
  subrows_col2: ['a','b','c']
}
]

when I'm looking for

docs [
{
  recordId: '1234',
  name: 'whatever',
  subrows: [{
     col1: 'x',
     col2: 'a'
 },
  {
     col1: 'y',
     col2: 'b'
 },
 {
     col1: 'z',
     col2: 'c'
 }]
} ]

I've seen the block-join stuff, but I'm confused as to where it goes. I added

<add>
 <doc>
  <field />
  <doc>
   <field />
  </doc>
 </doc>
</add>

to the DIH requestHandler, but it did nothing. I added it to the /update requestHandler and I got an error. I have no clue where that is supposed to go. Does it only work during a query, or is it only for when you push data to Solr via /update?

Where do I define the structure for the document? I tried nested fields in the schema, entities in the DIH config, and the block-join stuff in the requestHandlers. Nothing has worked yet.

Obviously I'm missing something.

Recommended answer

DIH does not produce nested documents. Solr supports them, but DIH can't yet generate them.

Nested entities in DIH exist so that you can merge sources and create entities by iterating over a different source. For example, an outer entity can read a file containing file names while an inner entity loads the content of each of those files, with each file getting its own record.

You may want to move your nested object code into the client with SolrJ for now.
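
As a rough illustration of what "moving the nesting into the client" could look like, here is a minimal sketch that regroups the flattened columns from the question into one child record per sub-row. It uses only the JDK so it runs on its own; the class and method names (`NestedRowsSketch`, `zipSubrows`, `buildParent`) are hypothetical. In a real SolrJ client, each map would presumably become a `SolrInputDocument`, with children attached through `SolrInputDocument.addChildDocument(...)` before the parent is sent to Solr. That step is left out because it needs the SolrJ library and a running Solr instance.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedRowsSketch {

    // Pair the i-th value of each flattened column into one child record.
    // Assumes both columns came from the same sub-rows in the same order.
    static List<Map<String, String>> zipSubrows(List<String> col1, List<String> col2) {
        if (col1.size() != col2.size()) {
            throw new IllegalArgumentException("columns must have the same length");
        }
        List<Map<String, String>> subrows = new ArrayList<>();
        for (int i = 0; i < col1.size(); i++) {
            Map<String, String> child = new LinkedHashMap<>();
            child.put("col1", col1.get(i));
            child.put("col2", col2.get(i));
            subrows.add(child);
        }
        return subrows;
    }

    // Build the parent record with its children nested under "subrows".
    // With SolrJ, this is where each child would be attached to the parent
    // document (for example via addChildDocument) instead of stored in a map.
    static Map<String, Object> buildParent(String recordId, String name,
                                           List<Map<String, String>> subrows) {
        Map<String, Object> parent = new LinkedHashMap<>();
        parent.put("recordId", recordId);
        parent.put("name", name);
        parent.put("subrows", subrows);
        return parent;
    }

    public static void main(String[] args) {
        // Sample values taken from the question's example output.
        List<Map<String, String>> subrows =
                zipSubrows(List.of("x", "y", "z"), List.of("a", "b", "c"));
        Map<String, Object> parent = buildParent("1234", "whatever", subrows);
        System.out.println(parent);
    }
}
```

In practice the client would more likely read the parent and child rows directly over JDBC than zip multivalued fields back together; the zipping here only mirrors the two shapes shown in the question.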
