SOLR not searching on certain fields

Problem description

I just installed Solr, edited schema.xml, and am now trying to index some test data and search on it.

In the XML file I'm sending to Solr, one of my fields looks like this:

<field name="PageContent"><![CDATA[<p>some text in a paragrah tag</p>]]></field>

There's HTML there, so I've wrapped it in CDATA.

In my Solr schema.xml, the definition for that field looks like this:

<field name="PageContent" type="text" indexed="true" stored="true"/>

When I ran the POSTing tool, everything went OK, but when I search for content that I know is inside the PageContent field, I get no results.

However, when I set the <defaultSearchField> node to PageContent, it works. But if I set it to any other field, it doesn't search in PageContent.

Am I doing something wrong? What's the issue?

Clarification:

I've uploaded a "doc" with the following data:

<field name="PageID">928</field>
<field name="PageName">some name</field>
<field name="PageContent"><![CDATA[<p>html content</p>]]></field>

In my schema I've defined the fields as such:

<field name="PageID" type="integer" indexed="true" stored="true" required="true"/>
<field name="PageName" type="text" indexed="true" stored="true"/>
<field name="PageContent" type="text" indexed="true" stored="true"/>

And:

<uniqueKey>PageID</uniqueKey>
<defaultSearchField>PageName</defaultSearchField>

Now, when I use the Solr admin tool and search for "some name", I get a result. But if I search for "html content", "html", "content" or "928", I get no results.

Why?

Recommended answer

You mentioned that your default search field is set to PageName. An unqualified query is only run against that field, so I wouldn't expect a search for "content" to return anything.

You probably meant to put "PageContent:content" in the search box to find data in that field. If you want to search against multiple fields, check out the DisMax request handler: http://wiki.apache.org/solr/DisMaxRequestHandler. The Solr admin console is not a great tool for playing around with all the DisMax search options; you'll want to manipulate the URL directly for that.

Regardless, I agree with the previous poster: if your analysis isn't set up properly to deal with HTML, you are likely to get all sorts of unexpected search results. Strip the HTML out and index text only.
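If pre-processing the documents before posting isn't convenient, one alternative is to let Solr drop the markup at analysis time. This is not what the answer describes, just a rough sketch of a different route. It assumes a Solr version that ships `solr.HTMLStripCharFilterFactory`; the field type name `text_html` is made up for illustration, and the tokenizer and filter shown are placeholders for whatever your existing `text` type uses.

```xml
<!-- schema.xml: hypothetical field type that strips HTML before tokenizing -->
<fieldType name="text_html" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- removes tags such as <p>...</p> from the character stream -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <!-- placeholder tokenizer/filter; mirror your existing "text" type here -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- PageContent switched to the hypothetical type above -->
<field name="PageContent" type="text_html" indexed="true" stored="true"/>
```

This only changes the indexed tokens. With stored="true", the value returned in results still contains the original HTML, and existing documents need to be re-indexed after the schema change.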

If you want your standard query handler to search against all your fields, you can change it in your solrconfig.xml (I always add a second query handler instead of modifying "standard"). The qf parameter is the list of fields you want to search against. It's a space-separated list.

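<!-- DisMax handler: every query is run against the fields listed in qf -->
<requestHandler name="standard" class="solr.DisMaxRequestHandler">
  <lst name="defaults">
    <str name="echoParams">all</str>
    <str name="hl">true</str>
    <str name="fl">*</str>
    <!-- space-separated list of fields to search -->
    <str name="qf">PageName PageContent</str>
  </lst>
</requestHandler>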
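For the "second query handler" approach mentioned above, a minimal sketch might look like the following. The handler name `pagesearch` is hypothetical, and the sketch assumes the same `solr.DisMaxRequestHandler` class used in the snippet above is available in your Solr version.

```xml
<!-- solrconfig.xml: hypothetical extra handler; "standard" stays as it was -->
<requestHandler name="pagesearch" class="solr.DisMaxRequestHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <!-- fields searched when no field is named in the query -->
    <str name="qf">PageName PageContent</str>
  </lst>
</requestHandler>
```

You would then pick it per request by adding `qt=pagesearch` to the query URL, so plain queries against "standard" keep behaving as before.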
