Exact Meaning of "Slop" in Lucene SpanNearQuery (or slop in ElasticSearch span_near)

Question

Question 1: In Lucene's SpanNearQuery (or span_near in ElasticSearch), what is the exact meaning of slop? Is it the number of words separating the two matching words, or that number plus 1?

For example, suppose your indexed text is: foo bar biz

Which of these queries would match this text: "foo biz"~0, "foo biz"~1, "foo biz"~2?

I would expect that the first wouldn't match and the last would. But what about the middle one?

Question 2: A second, more complex follow-up question: how is slop handled if there are more than two search clauses? Is it applied to each pair of clauses or to any pair of clauses?

For example, suppose you construct a SpanNearQuery with three clauses: foo, bar, biz. What slop is needed to match the same indexed text above? I would expect that a slop of 2 definitely would, but what about 0 or 1?

Similarly, with the same three-clause query, what slop is needed to match the text: foo bar ble biz?

Recommended answer

Question 1: Slop is the number of words separating the span clauses, so a slop of 0 means they are adjacent. In the example I gave, a slop of 1 would match.
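The two-clause rule above can be sketched in a few lines of Python. This is only an illustration of the counting, not Lucene or ElasticSearch code: it assumes whitespace tokenization, one position per word, single-term clauses that each occur once, and clauses given in text order. The helper names `separating_words` and `matches_pair` are made up for this example.

```python
def separating_words(tokens, first, second):
    """Count the tokens strictly between `first` and `second`.

    Assumption: `first` occurs before `second`, and only the first
    occurrence of each is considered.
    """
    start = tokens.index(first)
    end = tokens.index(second, start + 1)
    return end - start - 1


def matches_pair(text, first, second, slop):
    """Return True when at most `slop` words separate the two terms."""
    tokens = text.split()  # assumption: simple whitespace tokenization
    return separating_words(tokens, first, second) <= slop


# The three queries from the question against the text "foo bar biz".
for slop in (0, 1, 2):
    print(slop, matches_pair("foo bar biz", "foo", "biz", slop))
```

Under these assumptions, only the slop 0 query fails, because one word (bar) sits between foo and biz.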

Question 2: When there are more than two span near clauses, each clause must be connected to at least one other clause with no more than slop words separating them, AND all of the clauses must be connected to each other through a chain. However, a clause need not be within slop words of every other clause.

For the first example in question 2: a slop of 0, 1, or 2 would match. A slop of 0 matches even though foo and biz are separated by one word (more than the slop), because there is a chain through all clauses.

For the second example in question 2: a slop of 0 would not match, because biz is separated from every other clause by more than 0 words. A slop of 1 would match, because foo and bar are separated by 0 words and bar and biz are separated by 1 word. It matches even though foo and biz are separated by more than one word, because there is a chain through all clauses. A slop of 2 would obviously match.
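The chain rule as worded in this answer can be modeled with a short Python sketch. It is not Lucene's implementation, and it has only been checked against the two example texts here, so results for other inputs should be verified against Lucene itself. It assumes whitespace tokenization, single-term clauses that each appear exactly once, and no ordering requirement. The function name `matches_chain` is hypothetical. Because the clauses sit on a line of positions, they form one chain exactly when every gap between neighbouring clauses is at most slop.

```python
def matches_chain(text, terms, slop):
    """Model of the chain rule described in the answer (not Lucene code).

    Assumption: every term in `terms` occurs exactly once in `text`.
    """
    tokens = text.split()  # assumption: simple whitespace tokenization
    positions = sorted(tokens.index(term) for term in terms)
    # Number of words between each pair of neighbouring clauses.
    gaps = [b - a - 1 for a, b in zip(positions, positions[1:])]
    # All clauses are chained together iff no neighbouring gap exceeds slop.
    return all(gap <= slop for gap in gaps)


terms = ["foo", "bar", "biz"]
for text in ("foo bar biz", "foo bar ble biz"):
    for slop in (0, 1, 2):
        print(repr(text), slop, matches_chain(text, terms, slop))
```

For "foo bar biz" every slop value matches, and for "foo bar ble biz" only slop 0 fails, which agrees with the two examples worked through above.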
