Query multiple collections with different fields in Solr

Question

Given the following (single-core) queries:

http://localhost/solr/a/select?indent=true&q=*:*&rows=100&start=0&wt=json
http://localhost/solr/b/select?indent=true&q=*:*&rows=100&start=0&wt=json

The first query returns "numFound":40000. The second query returns "numFound":10000.

I tried combining them with the shards parameter:

   http://localhost/solr/a/select?indent=true&shards=localhost/solr/a,localhost/solr/b&q=*:*&rows=100&start=0&wt=json

Now I get "numFound":50000. The only problem is that "a" has more fields (columns) than "b", so the multi-collection request only returns the values of "a".

Is it possible to query multiple collections with different fields, or do they have to be the same? And how should I change my third URL to get this result?

Recommended answer

What you need is what I call a unification core. That core itself holds no content; it only serves as a wrapper that unifies the fields you want to display from both cores. It needs:

  • a schema.xml that contains all the fields you want in the unified result
  • a query handler that combines the two different cores for you

First, note this from the Solr Wiki page about DistributedSearch:

Documents must have a unique key and the unique key must be stored (stored="true" in schema.xml). The unique key field must be unique across all shards. If docs with duplicate unique keys are encountered, Solr will make an attempt to return valid results, but the behavior may be non-deterministic.
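Because duplicate keys across shards make the result non-deterministic, it can be worth checking for them before relying on a distributed query. The following is only a minimal sketch: the function name find_duplicate_keys and the sample ids are hypothetical, and it assumes you have already collected the ids of each shard by some other means.

```python
from collections import defaultdict


def find_duplicate_keys(ids_by_shard):
    """Return {id: [shards]} for every id that occurs in more than one shard.

    ids_by_shard is a hypothetical mapping such as {"shard-1": [1, 3], ...};
    how the ids are collected is outside the scope of this sketch.
    """
    shards_for_id = defaultdict(list)
    for shard, ids in ids_by_shard.items():
        # set() so that a repeated id inside one shard is counted once
        for doc_id in set(ids):
            shards_for_id[doc_id].append(shard)
    return {
        doc_id: shards
        for doc_id, shards in shards_for_id.items()
        if len(shards) > 1
    }


# Made-up ids: id 3 exists in both shards and would violate the rule above.
duplicates = find_duplicate_keys({"shard-1": [1, 3], "shard-2": [2, 3]})
```

An empty result means the unique key is in fact unique across the shards that were checked.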

As an example, I have shard-1 with the fields id, title, description and shard-2 with the fields id, title, abstractText. So I have these schemas:

Schema of shard-1

<schema name="shard-1" version="1.5">

  <fields>
    <field name="id"
          type="int" indexed="true" stored="true" multiValued="false" />
    <field name="title" 
          type="text" indexed="true" stored="true" multiValued="false" />
    <field name="description"
          type="text" indexed="true" stored="true" multiValued="false" />
  </fields>
  <!-- type definition left out, have a look in github -->
</schema>

Schema of shard-2

<schema name="shard-2" version="1.5">

  <fields>
    <field name="id" 
      type="int" indexed="true" stored="true" multiValued="false" />
    <field name="title" 
      type="text" indexed="true" stored="true" multiValued="false" />
    <field name="abstractText" 
      type="text" indexed="true" stored="true" multiValued="false" />
  </fields>
  <!-- type definition left out, have a look in github -->
</schema>

To unify these schemas I create a third schema that I call shard-unification, which contains all four fields.

<schema name="shard-unification" version="1.5">

  <fields>
    <field name="id" 
      type="int" indexed="true" stored="true" multiValued="false" />
    <field name="title" 
      type="text" indexed="true" stored="true" multiValued="false" />
    <field name="abstractText" 
      type="text" indexed="true" stored="true" multiValued="false" />
    <field name="description" 
      type="text" indexed="true" stored="true" multiValued="false" />
  </fields>
  <!-- type definition left out, have a look in github -->
</schema>
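With only two small schemas the union of fields is easy to write by hand. For larger schemas, a small script can list the field names the unified schema has to declare. This is a rough sketch, not part of the original answer: the helper names are hypothetical, the schema strings are abbreviated versions of the ones above, and it only compares field names, not types or other attributes.

```python
import xml.etree.ElementTree as ET

# Abbreviated copies of the two shard schemas shown above.
SHARD_1 = """<schema name="shard-1" version="1.5"><fields>
  <field name="id" type="int" indexed="true" stored="true" />
  <field name="title" type="text" indexed="true" stored="true" />
  <field name="description" type="text" indexed="true" stored="true" />
</fields></schema>"""

SHARD_2 = """<schema name="shard-2" version="1.5"><fields>
  <field name="id" type="int" indexed="true" stored="true" />
  <field name="title" type="text" indexed="true" stored="true" />
  <field name="abstractText" type="text" indexed="true" stored="true" />
</fields></schema>"""


def field_names(schema_xml):
    """Return the name attribute of every <field> element, in document order."""
    root = ET.fromstring(schema_xml)
    return [field.get("name") for field in root.iter("field")]


def unified_field_names(*schemas):
    """Union of field names across schemas, keeping first-seen order."""
    names = []
    for schema_xml in schemas:
        for name in field_names(schema_xml):
            if name not in names:
                names.append(name)
    return names


unified = unified_field_names(SHARD_1, SHARD_2)
```

Fields that share a name should also share a compatible type across shards; this sketch does not check that.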

Now I need to make use of this combined schema, so I create a query handler in the solrconfig.xml of the shard-unification core:

<requestHandler name="standard" class="solr.StandardRequestHandler" default="true">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="q.alt">*:*</str>
    <str name="qf">id title description abstractText</str>
    <str name="fl">*,score</str>
    <str name="mm">100%</str>
  </lst>
</requestHandler>
<queryParser name="edismax" class="org.apache.solr.search.ExtendedDismaxQParserPlugin" />

That's it. Now some indexed data is required in shard-1 and shard-2. To query for a unified result, just query shard-unification with the appropriate shards parameter.

http://localhost/solr/shard-unification/select?q=*:*&rows=100&start=0&wt=json&shards=localhost/solr/shard-1,localhost/solr/shard-2
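If the list of shards changes often, the URL above can be assembled programmatically. The sketch below uses only the Python standard library; the function name build_unified_query_url is hypothetical, and the host and core names are simply the ones from this example.

```python
from urllib.parse import urlencode


def build_unified_query_url(core_url, shards, q="*:*", rows=100, start=0):
    """Build a select URL against the unification core.

    shards is a list of "host/solr/core" strings; Solr expects them
    comma-separated in the shards parameter.
    """
    params = {
        "q": q,
        "rows": rows,
        "start": start,
        "wt": "json",
        "shards": ",".join(shards),
    }
    # urlencode percent-encodes the values; Solr decodes them on receipt.
    return f"{core_url}/select?{urlencode(params)}"


url = build_unified_query_url(
    "http://localhost/solr/shard-unification",
    ["localhost/solr/shard-1", "localhost/solr/shard-2"],
)
```

The resulting URL is the percent-encoded equivalent of the hand-written one above.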

This returns a result like:

{
  "responseHeader":{
    "status":0,
    "QTime":10},
  "response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[
      {
        "id":1,
        "title":"title 1",
        "description":"description 1",
        "score":1.0},
      {
        "id":2,
        "title":"title 2",
        "abstractText":"abstract 2",
        "score":1.0}]
  }}
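As the response shows, each document only carries the fields its own shard knows about, so client code should not assume that every field is present. A minimal sketch of one way to handle this, assuming the JSON response above and a hypothetical helper normalize_docs that fills missing fields with None:

```python
import json

# The four fields declared in the shard-unification schema.
UNIFIED_FIELDS = ("id", "title", "description", "abstractText")

# The sample response from above, as it would arrive over HTTP.
SAMPLE_RESPONSE = """
{
  "responseHeader": {"status": 0, "QTime": 10},
  "response": {"numFound": 2, "start": 0, "maxScore": 1.0, "docs": [
    {"id": 1, "title": "title 1", "description": "description 1", "score": 1.0},
    {"id": 2, "title": "title 2", "abstractText": "abstract 2", "score": 1.0}
  ]}
}
"""


def normalize_docs(response_json):
    """Return one dict per document containing every unified field.

    Fields a shard does not have are set to None instead of being absent.
    """
    data = json.loads(response_json)
    return [
        {name: doc.get(name) for name in UNIFIED_FIELDS}
        for doc in data["response"]["docs"]
    ]


rows = normalize_docs(SAMPLE_RESPONSE)
```

Whether None, an empty string, or some other default is the right filler depends on the consumer of the data.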

Fetch the origin shard of a document

If you want to fetch the originating shard into each document, specify [shard] within fl, either as a parameter of the query or in the request handler's defaults (see below). The brackets are mandatory, and they also appear in the resulting response.

<requestHandler name="standard" class="solr.StandardRequestHandler" default="true">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="q.alt">*:*</str>
    <str name="qf">id title description abstractText</str>
    <str name="fl">*,score,[shard]</str>
    <str name="mm">100%</str>
  </lst>
</requestHandler>
<queryParser name="edismax" class="org.apache.solr.search.ExtendedDismaxQParserPlugin" />
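Since the brackets are part of the key in the response, client code has to look up "[shard]" literally. The following sketch groups document ids by their originating shard; the helper name group_ids_by_shard is hypothetical, and the shard values in the sample documents are assumptions about what Solr reports, so check an actual response from your installation.

```python
from collections import defaultdict


def group_ids_by_shard(docs):
    """Map each "[shard]" value to the ids of the documents that came from it."""
    grouped = defaultdict(list)
    for doc in docs:
        # The key includes the brackets, exactly as requested in fl.
        grouped[doc.get("[shard]", "unknown")].append(doc["id"])
    return dict(grouped)


# Hypothetical documents; the "[shard]" values are assumed for illustration.
docs = [
    {"id": 1, "title": "title 1", "[shard]": "localhost/solr/shard-1"},
    {"id": 2, "title": "title 2", "[shard]": "localhost/solr/shard-2"},
]
by_shard = group_ids_by_shard(docs)
```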

Working sample

If you want to see a running example, check out my solrsample project on GitHub and execute the ShardUnificationTest. It now also includes the shard fetching.
