How do I control the priority of nested queries in Sitecore ContentSearch with the Solr Provider?

Problem description

Version Details: I am working with Sitecore 7.5 build 141003, using Solr v4.7 as the search engine/indexing server. I am also using the standard Sitecore Solr provider with no custom indexers.

Target Goal: I am using Sitecore ContentSearch LINQ with PredicateBuilder to compile some flexible, nested queries. Currently, I need to search within a specific "Root item", while excluding templates with "folder" in their name and also excluding items with "/testing" in their path. At some point there could be more than one "Root item", and more than one excluded path fragment (currently just "/testing"). In those cases, the idea is to use PredicateBuilder to build an outer "AND" predicate with inner "OR"s for the multiple "Root item"s and path exclusions.

Problem: At the moment, I am dealing with an issue regarding the order of nesting and priorities for these predicates/conditions. I have been testing several approaches and combinations, but the issue I keep running into is that the !TemplateName.Contains and !Item["_fullpath"].Contains conditions are grouped together and prioritized over the Paths.Contains condition, which ends up producing 0 results each time.

I am using the Search.log to check the query output, and I have been manually testing against the Solr admin, running queries against it to compare results. Below, you will find examples of the combinations I have tried using Sitecore Linq, and the queries they produce for Solr.

Original Code Sample:

Original test with List for root items

// sometimes will be 1, sometimes will be multiple
var rootItems = new List<ID> { pathID };  // simplified to 1 item for now
var query = context.GetQueryable<SearchResultItem>();
var folderFilter = PredicateBuilder.True<SearchResultItem>().And(i => !i.TemplateName.Contains("folder") && !i["_fullpath"].Contains("/testing"));
var pathFilter = PredicateBuilder.False<SearchResultItem>();
pathFilter = rootItems.Aggregate(pathFilter, (current, id) => current.Or(i => i.Paths.Contains(id)));
folderFilter = folderFilter.And(pathFilter);
query.Filter(folderFilter).GetResults();

Query output: (-_templatename:(*folder*) AND -_fullpath:(*/testing*)) AND _path:(730c169987a44ca7a9ce294ad7151f13)

As you can see in the above output, there is an inner set of parentheses around the two "not contains" filters, which takes precedence over the Path one. When I run this exact query in the Solr admin, it returns 0 results. However, if I remove the inner parentheses so it's all a single "AND" set, it returns the expected results.

I tested this further with different combinations and approaches to the PredicateBuilder, and each combination results in the same query. I even tried adding two individual filters ("query.Filter(pred1).Filter(pred2)") to my main query object, and it results in the same output.

Additional Code Samples:

Alt 1 - Adding "Paths.Contains" to folder filter directly

var query = context.GetQueryable<SearchResultItem>();
var folderFilter = PredicateBuilder.True<SearchResultItem>().And(i => !i.TemplateName.Contains("folder") && !i["_fullpath"].Contains("/testing"));
folderFilter = folderFilter.And(i => i.Paths.Contains(pathID));
query.Filter(folderFilter).GetResults();

Query output: (-_templatename:(*folder*) AND -_fullpath:(*/testing*)) AND _path:(730c169987a44ca7a9ce294ad7151f13)

Alt 2 - Two predicates joined to first

var query = context.GetQueryable<SearchResultItem>();
var folderFilter = PredicateBuilder.True<SearchResultItem>().And(i => !i.TemplateName.Contains("folder") && !i["_fullpath"].Contains("/testing"));
var pathFilter = PredicateBuilder.False<SearchResultItem>().Or(i => i.Paths.Contains(pathID));
folderFilter = folderFilter.And(pathFilter);
query.Filter(folderFilter).GetResults();

Query output: (-_templatename:(*folder*) AND -_fullpath:(*/testing*)) AND _path:(730c169987a44ca7a9ce294ad7151f13)

Alt 3 - Two "inner" predicates, one for "Not"s and one for "Paths" joined to an outer predicate

var query = context.GetQueryable<SearchResultItem>();
var folderFilter = PredicateBuilder.True<SearchResultItem>().And(i => !i.TemplateName.Contains("folder") && !i["_fullpath"].Contains("/testing"));
var pathFilter = PredicateBuilder.False<SearchResultItem>().Or(i => i.Paths.Contains(pathID));
var finalPredicate = PredicateBuilder.True<SearchResultItem>().And(folderFilter).And(pathFilter);
query.Filter(finalPredicate).GetResults();

Query output: (-_templatename:(*folder*) AND -_fullpath:(*/testing*)) AND _path:(730c169987a44ca7a9ce294ad7151f13)

Conclusion: Ultimately, what I am looking for is a way to control the prioritization of these nested queries/conditions, or how I can build them to put the paths first, and the "Not" filters after. As mentioned, there are conditions where we will have multiple "Root items" and multiple path exclusions where I need to query something more like:

(-_templatename:(*folder*) AND -_fullpath:(*/testing*) AND (_path:(730c169987a44ca7a9ce294ad7151f13) OR _path:(12c1aa7f60fa4e8d9f0a983bbbb40d8b)))

OR

(-_templatename:(*folder*) AND -_fullpath:(*/testing*) AND (_path:(730c169987a44ca7a9ce294ad7151f13)))

Both of these queries return the results I expect/need when I run them directly in the Solr admin. However, I cannot seem to come up with an approach or order of operations using Sitecore ContentSearch Linq to output a query this way.

Does anyone else have experience with how I can accomplish this? Depending on the suggestion, I am also willing to assemble this piece of the query without Sitecore Linq, if I can marry it back to the IQueryable for calling "GetFacets" and "GetResults".

Update: I didn't include all the revisions I have done because SO would probably kill me for how long this would get. That said, I did try one other slight variation on my original example (top) with a similar result as the others:

var folderFilter = PredicateBuilder.True<SearchResultItem>().And(i => !i.TemplateName.Contains("folder")).And(i => !i["_fullpath"].Contains("/testing"));
var rootItems = new List<ID> { pathID, path2 };
// or paths separately
var pathFilter = PredicateBuilder.False<SearchResultItem>();
pathFilter = rootItems.Aggregate(pathFilter, (current, id) => current.Or(i => i.Paths.Contains(id)));   
var finalPredicate = folderFilter.And(pathFilter);
var query = context.GetQueryable<SearchResultItem>();
query.Filter(finalPredicate).GetResults();

Query Output: ((-_templatename:(*folder*) AND -_fullpath:(*/testing*)) AND (_path:(730c169987a44ca7a9ce294ad7151f13) OR _path:(12c1aa7f60fa4e8d9f0a983bbbb40d8b)))

And it's still those inner parentheses around the "_templatename" and "_fullpath" conditions that cause problems.

Thanks.

Solution

Alright, I raised this question here and posted the situation to Sitecore support as well, and I just received a response and some additional information.

According to the Solr wiki (http://wiki.apache.org/solr/FAQ), in the "Searching" section, the answer to the question "Why does 'foo AND -baz' match docs, but 'foo AND (-bar)' doesn't?" explains why the results are coming back as 0:

Boolean queries must have at least one "positive" expression (ie; MUST or SHOULD) in order to match. Solr tries to help with this, and if asked to execute a BooleanQuery that contains only negated clauses at the topmost level, it adds a match all docs query (ie: *:*)

If the top level BooleanQuery contains somewhere inside of it a nested BooleanQuery which contains only negated clauses, that nested query will not be modified, and it (by definition) can't match any documents -- if it is required, that means the outer query will not match.

I am not sure exactly how the Sitecore Solr provider constructs the query, or why it groups the negatives together in a nested query, but a nested query containing only negatives returns 0 results, which is the expected behavior according to the Solr documentation. The trick, then, is to add a "match all" query (*:*) to the sub-query.
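To make the rule above concrete, here is a minimal sketch of a toy model of boolean matching. It is not Solr, Lucene, or Sitecore code: the Node, Term, MatchAll, and BoolQuery classes and the shortened field values are all hypothetical, and the model deliberately leaves out the special handling Solr applies to a pure-negative query at the topmost level. It only illustrates why a nested group of negatives matches nothing until a positive clause such as *:* is added.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: hypothetical classes, not the Solr or Sitecore API.
public abstract class Node
{
    public abstract bool Matches(ISet<string> docTerms);
}

public sealed class Term : Node
{
    private readonly string _value;
    public Term(string value) { _value = value; }
    public override bool Matches(ISet<string> docTerms) => docTerms.Contains(_value);
}

// Stands in for the *:* query, which matches every document.
public sealed class MatchAll : Node
{
    public override bool Matches(ISet<string> docTerms) => true;
}

public sealed class BoolQuery : Node
{
    public List<Node> Must { get; } = new List<Node>();
    public List<Node> MustNot { get; } = new List<Node>();

    public override bool Matches(ISet<string> docTerms)
    {
        // With no positive clause nothing is selected, so the
        // negative clauses have nothing to subtract from.
        if (Must.Count == 0) return false;
        return Must.All(c => c.Matches(docTerms))
            && !MustNot.Any(c => c.Matches(docTerms));
    }
}

public static class Demo
{
    public static void Main()
    {
        // A document under the root item that is not a folder.
        var doc = new HashSet<string> { "path:730c", "template:article" };

        // (-template:folder) AND path:730c
        var negativesOnly = new BoolQuery();
        negativesOnly.MustNot.Add(new Term("template:folder"));
        var broken = new BoolQuery();
        broken.Must.Add(negativesOnly);
        broken.Must.Add(new Term("path:730c"));

        // (-template:folder AND *:*) AND path:730c
        var withMatchAll = new BoolQuery();
        withMatchAll.MustNot.Add(new Term("template:folder"));
        withMatchAll.Must.Add(new MatchAll());
        var patched = new BoolQuery();
        patched.Must.Add(withMatchAll);
        patched.Must.Add(new Term("path:730c"));

        Console.WriteLine(broken.Matches(doc));  // False
        Console.WriteLine(patched.Matches(doc)); // True
    }
}
```

In this model the only difference between the two queries is the MatchAll clause inside the nested group, which mirrors the difference between the queries that returned 0 results and the ones that worked in the Solr admin.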

Instead of having to do this manually for any query that I think might encounter this situation, the support rep provided a patch DLL to replace the provider, that will automatically modify the nested query to remedy this.

They also logged this as a bug and provided reference number 398622 for the issue.

Now, the resulting query looks like this:

((-_templatename:(*folder*) AND -_fullpath:(*/testing*) AND *:*) AND _path:(730c169987a44ca7a9ce294ad7151f13))

or, for multiple queries:

((-_templatename:(*folder*) AND -_fullpath:(*/testing*) AND *:*) AND (_path:(730c169987a44ca7a9ce294ad7151f13) OR _path:(12c1aa7f60fa4e8d9f0a983bbbb40d8b)))

And the results return as expected. If anyone else comes across this, I would use the reference number with Sitecore support and see if they can provide the patch. You will also have to update the provider used in your Solr.Index and Solr.Indexes.Analytics config files.
