What are the limitations of boolean query in Lucene?


Question

I need to find items in a Lucene index that meet two basic criteria:
1. match a specific string called a 'relation'
2. fall within a list of entitlement 'grant groups'

An entitlement group defines a subset of items accessible by a member of that group and is much like an authorization role.

All documents in the Lucene index have the 'relation' field and, for simplicity's sake, one or more 'grant-group' fields.

So, for example, a user may search for 'foobar', and that user may be a member of groups a, b, c. Let's say foobar has grant groups a, p, q, s.

The query will be, basically: match 'foobar' AND (a OR b OR c).
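As a rough illustration of that query shape, the sketch below assembles a query string in Lucene's classic query-parser syntax. The helper name `build_entitlement_query` and its parameters are hypothetical, and the field names `relation` and `grant-group` are taken from the description above; whether quoted values match depends on how the fields are analyzed at index time, which is an assumption here.

```python
def build_entitlement_query(relation, groups,
                            relation_field="relation",
                            group_field="grant-group"):
    """Return a query string of the form: relation AND (g1 OR g2 OR ...)."""
    if not groups:
        raise ValueError("at least one grant group is required")

    def quote(value):
        # Wrap the value in double quotes, escaping backslashes and quotes.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'

    # One OR'd clause per grant group the user belongs to.
    group_clauses = " OR ".join(f"{group_field}:{quote(g)}" for g in groups)
    return f"{relation_field}:{quote(relation)} AND ({group_clauses})"


print(build_entitlement_query("foobar", ["a", "b", "c"]))
# relation:"foobar" AND (grant-group:"a" OR grant-group:"b" OR grant-group:"c")
```

With 200 or 300 groups the string simply grows by one `grant-group` clause per group; the structure of the query stays the same.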

This should work according to Lucene documentation.

My question is this: How far can you go with the 2nd part of the boolean query, namely, the part after 'AND'? The reason for asking is this: I am about to do a small feasibility study, and part of the requirements is the need to support potentially MANY groups in the 'OR' clause, possibly up to 200 or 300 groups.

Would performance degrade noticeably?

Thanks.

Recommended answer

From this Lucene performance overview:

To put it another way: for standard disjunctive (OR'd) queries, the number of clauses doesn't really affect performance, except to the extent that more documents are potential matches.

As Avi mentioned, you will hit a limit at 1024 clauses, which is the default maximum clause count for a boolean query.
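To put the 200 to 300 groups from the question next to that limit, here is a back-of-the-envelope estimate. It assumes one clause per grant group plus one for the relation term; how nested clauses are actually counted can differ between Lucene versions, so treat the numbers as approximate. The function name `clause_headroom` is hypothetical.

```python
DEFAULT_MAX_CLAUSES = 1024  # the limit quoted in the answer


def clause_headroom(num_groups, max_clauses=DEFAULT_MAX_CLAUSES):
    # Assumption: one clause per grant group, plus one for the relation term.
    needed = num_groups + 1
    return max_clauses - needed


for n in (200, 300, 1024):
    print(n, clause_headroom(n))
# 200 823
# 300 723
# 1024 -1
```

Under this estimate, 300 groups leave comfortable room below the limit, and a negative result means the query would exceed it.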
