Solr highlighting gives fields/snippets with ANY term, instead of those that satisfy the query fully


Question

I'm using Solr 5.x with the standard highlighter, and I'm getting snippets that match only one of the search terms, even when I specify q.op=AND. I need ONLY the fields and snippets that match ALL the terms (unless I say q.op=OR or omit it), i.e. the field/snippet must satisfy the query. Solr does return the field/snippet that has all the terms, but it also returns many others.

I'm using hl.fl=* to get only the fields that contain the terms, and I'm searching against the default field ('text', which contains the full document). I need to use * because I have multiple dynamic fields. Most fields are of type 'text_general' (for search and highlighting), and some are of type 'string' for faceting.

If it's not possible for snippets to contain all the terms, I MUST at least get only the fields that satisfy the query fully. (This question talks mostly about matching all the terms, but the search query can become arbitrarily complex, so the fields/snippets should match the query.)

Also, the next step is to get snippets highlighted for proximity-based searches. What should I do or use for this? The fields returned by highlighting in this scenario should also satisfy the proximity query, unlike now, where I get any field that contains any term, without regard to proximity constraints and the other query terms.

Thanks for your help.

Recommended answer

I've also encountered the same problem with highlighting. In my case, a query like

(foo AND bar) OR eggs

highlighted eggs and foo even though bar was not present in the document. I didn't manage to come up with a proper solution, but I devised a dirty workaround.

I use the following query:

id:highlighted_document_id AND text:(my_original_query)

with debugQuery set to true. Then I parse the explain text for highlighted_document_id. The text contains the terms from the query that contributed to the score. The terms that should not be highlighted are not present in the explanation.
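As a rough illustration of the request described above, here is a minimal Python sketch that builds the parameters for such a query. The helper name build_explain_params, the sample document id, and the wt=json choice are assumptions made for illustration; only the query shape and debugQuery=true come from the answer. Document ids containing characters that are special in Solr query syntax would need escaping, which this sketch does not do.

```python
from urllib.parse import urlencode


def build_explain_params(doc_id, original_query, field="text"):
    """Build Solr request parameters that restrict the original query to a
    single document and ask Solr for the score explanation.

    Hypothetical helper: the name and the `field` default are illustrative.
    """
    return {
        # Same shape as the query in the answer:
        # id:highlighted_document_id AND text:(my_original_query)
        "q": f"id:{doc_id} AND {field}:({original_query})",
        # Ask Solr to include debug information (with the explain text).
        "debugQuery": "true",
        # Assumption: JSON output, which is convenient to parse in Python.
        "wt": "json",
    }


params = build_explain_params("doc42", "(foo AND bar) OR eggs")
query_string = urlencode(params)  # ready to append to a /select URL
```

With JSON output, the explanation for each document should be available in the debug section of the response, keyed by document id; check this against the actual response of your Solr version before relying on it.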

The Python regular expressions I use to extract the terms (valid for Solr 5.2.1):

import re

# Plain terms, taken from explain lines such as "weight(text:<term> in <doc>)"
term_regex = re.compile(r'weight\(text:(.+) in')
# Wildcard terms, taken from explain lines such as "text:<pattern>, product of:"
wildcard_term_regex = re.compile(r'text:(.+), product')

Then I simply search for the highlight markings in the highlighted text and remove them if the marked term doesn't match any of the terms extracted by term_regex and wildcard_term_regex.
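The filtering step could look roughly like the sketch below. It is not the answer author's actual code: the helper names, the wildcard-to-regex translation, and the case-insensitive comparison are assumptions, and sample_explain is a hand-written string shaped to fit the two regexes, not real Solr output. It assumes the highlighter's default `<em>`/`</em>` markers. Note that the explain text holds analyzed terms (for example lowercased or stemmed), so comparing them with the surface words in a snippet is a simplification.

```python
import re

term_regex = re.compile(r'weight\(text:(.+) in')
wildcard_term_regex = re.compile(r'text:(.+), product')
# Assumes the default highlight markers <em>...</em>.
highlight_regex = re.compile(r'<em>(.*?)</em>')


def scoring_terms(explain_text):
    """Collect plain and wildcard terms that appear in the explain text."""
    terms = set(term_regex.findall(explain_text))
    wildcards = set(wildcard_term_regex.findall(explain_text))
    return terms, wildcards


def wildcard_to_regex(pattern):
    """Translate a Solr wildcard pattern (* and ?) into a compiled regex."""
    parts = [".*" if c == "*" else "." if c == "?" else re.escape(c)
             for c in pattern]
    return re.compile("".join(parts), re.IGNORECASE)


def strip_unmatched(snippet, terms, wildcards):
    """Remove highlight markers around words that did not affect the score."""
    lowered = {t.lower() for t in terms}
    wildcard_res = [wildcard_to_regex(w) for w in wildcards]

    def keep_or_strip(match):
        word = match.group(1)
        if word.lower() in lowered or any(r.fullmatch(word) for r in wildcard_res):
            return match.group(0)  # keep the marker
        return word                # drop the marker, keep the word

    return highlight_regex.sub(keep_or_strip, snippet)


# Hand-written sample, shaped to fit the regexes above (not real Solr output).
sample_explain = (
    "0.42 = sum of:\n"
    "  0.42 = weight(text:eggs in 7) [DefaultSimilarity], result of:\n"
    "  1.0 = text:sp*, product of:\n"
)
terms, wildcards = scoring_terms(sample_explain)
cleaned = strip_unmatched(
    "<em>eggs</em> with <em>foo</em> and <em>spam</em>", terms, wildcards
)
```

Here foo loses its markers because it never appears in the sample explanation, while eggs and spam keep theirs.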

The solution is probably pretty limited, but it works for me.
