Is there a more efficient search factor than midpoint for binary search?


Question

The naive binary search is a very efficient algorithm: you take the midpoint of your high and low points in a sorted array and adjust your high or low point accordingly. Then you recalculate the midpoint and iterate until you find your target value (or you don't, of course).
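For reference, the midpoint search described above might look roughly like the following. This is a minimal sketch, not code from the question; the name `binary_search` and the choice to return `None` on a miss are assumptions made for illustration.

```python
from typing import Optional, Sequence


def binary_search(items: Sequence[int], target: int) -> Optional[int]:
    """Return an index of target in the sorted sequence items, or None."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = low + (high - low) // 2  # midpoint of the current range
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1   # target can only be in the upper part
        else:
            high = mid - 1  # target can only be in the lower part
    return None  # range is empty: target is not present
```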

Now, quite clearly, if you don't use the midpoint, you introduce some risk to the system. Let's say you shift your probe point away from the midpoint, which creates two sides; I'll call them the big side and the small side. (It doesn't matter whether the shift is toward high or low, because the situation is symmetrical.) The risk is that if you miss, your search space is bigger than it would have been: you've got to search the big side. The reward is that if you hit, your search space is smaller.
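One way to try the shifted split empirically is to count probes for every key in an array and compare averages. The sketch below is an illustration only: `skewed_search`, `average_probes`, and the `fraction` parameter are hypothetical names, and it assumes every key is present and equally likely to be searched for.

```python
from typing import Sequence, Tuple


def skewed_search(items: Sequence[int], target: int, fraction: float) -> Tuple[int, int]:
    """Search sorted items, probing `fraction` of the way through the range.

    Returns (index, probes); index is -1 if target is absent.
    fraction = 0.5 reproduces the usual midpoint search.
    """
    low, high = 0, len(items) - 1
    probes = 0
    while low <= high:
        probe = low + int((high - low) * fraction)  # always within [low, high]
        probes += 1
        if items[probe] == target:
            return probe, probes
        if items[probe] < target:
            low = probe + 1
        else:
            high = probe - 1
    return -1, probes


def average_probes(n: int, fraction: float) -> float:
    """Average probe count over all n keys of the array 0..n-1."""
    items = list(range(n))
    total = sum(skewed_search(items, key, fraction)[1] for key in items)
    return total / n


if __name__ == "__main__":
    for fraction in (0.5, 0.25, 0.1):
        print(f"fraction = {fraction}: {average_probes(1000, fraction):.2f} probes on average")
```

Because the symmetry argument applies, fractions above 0.5 should behave like their mirror images below 0.5, up to rounding.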

It occurs to me that the number of spaces being risked vs. rewarded is the same, and (absent any patterns in the data, which I'm assuming) an element is equally likely to be higher or lower than the midpoint. So the risk is that the target falls between the new probe point and the midpoint.

Now, because the number of spaces determines the search space, and the search space is measured logarithmically, it seems to me that if I used, let's say, 1/4 and 3/4 for the two sides, I'd have cut the log of the small space in half, whereas the large space would only have gone up by about .6 or .7.

So with all this in mind: is there a more efficient way of performing a binary search than just using the midpoint?

Recommended answer

Let's agree that the search key is equally likely to be at any position in the array; otherwise, we'd want to design an algorithm based on our special knowledge of the location. So all we can choose is where to split the array each time. If we choose a number 0 < x < 1 and split the array there, the chance that the key is on the left is x and the chance that it's on the right is 1-x. In the first case we shorten the array by a factor of x, and in the second by a factor of 1-x. If we did this many times we'd have a product of many of these factors, so the 'right' average to use here is the geometric mean. In that sense, the average decrease per step is x with weight x and 1-x with weight 1-x, for a total of x^x * (1-x)^(1-x).
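Written out, the geometric-mean argument might go as follows. This is a heuristic sketch rather than a rigorous proof: it assumes the split factors at different steps are independent, and it ignores rounding and the probed element itself. The symbols n_k, f_i, and g are introduced here for illustration.

```latex
% n_k : length of the remaining range after k splits (n_0 = n)
% f_i : shrink factor at step i, equal to x with probability x and 1-x otherwise
% (\mathbb requires the amssymb package)
\begin{align*}
n_k &= n \prod_{i=1}^{k} f_i, \qquad \ln n_k = \ln n + \sum_{i=1}^{k} \ln f_i, \\
\mathbb{E}[\ln f_i] &= x \ln x + (1-x)\ln(1-x), \\
g(x) &= \exp\bigl(\mathbb{E}[\ln f_i]\bigr) = x^{x}(1-x)^{1-x}, \\
k &\approx \frac{\ln n}{-\ln g(x)} \quad \text{(steps until } n_k \approx 1\text{)}.
\end{align*}
```

A smaller g(x) therefore means fewer steps, which is why the answer asks where x^x * (1-x)^(1-x) is minimized.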

So when is this minimized? If this were the math stackexchange, we'd take derivatives (with the product rule, chain rule, and exponent rule), set them to zero, and solve. But this is stackoverflow, so instead we graph it:

[Graph: x^x * (1-x)^(1-x) plotted for 0 < x < 1]
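Evaluating the expression on a grid gives the same picture in text form. The function name `average_shrink_factor` and the grid spacing are choices made for this sketch, not part of the original answer.

```python
def average_shrink_factor(x: float) -> float:
    """Geometric-mean shrink factor x^x * (1-x)^(1-x) for a split at 0 < x < 1."""
    return x ** x * (1.0 - x) ** (1.0 - x)


if __name__ == "__main__":
    grid = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    for x in grid:
        factor = average_shrink_factor(x)
        bar = "#" * round(60 * factor)  # crude text plot of the factor
        print(f"x = {x:.2f}  factor = {factor:.4f}  {bar}")
    best = min(grid, key=average_shrink_factor)
    print(f"smallest factor on this grid is at x = {best:.2f}")
```

A smaller factor is better here, since it means the remaining range shrinks faster per step.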

You can see that the further you get from 1/2, the worse you do. For a better understanding I recommend information theory or calculus, which have interesting and complementary perspectives on this.
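For readers who want those two perspectives spelled out, a short sketch follows. It is an addition for illustration, writing g(x) for x^x * (1-x)^(1-x) and H(x) for the binary entropy function.

```latex
% Calculus: minimize ln g(x), which has the same minimizer as g(x).
\begin{align*}
\ln g(x) &= x \ln x + (1-x)\ln(1-x), \\
\frac{d}{dx}\ln g(x) &= \ln x - \ln(1-x) = 0 \iff x = \tfrac{1}{2}, \\
\frac{d^{2}}{dx^{2}}\ln g(x) &= \frac{1}{x} + \frac{1}{1-x} > 0
  \quad\text{so } x = \tfrac{1}{2} \text{ is a minimum, with } g\bigl(\tfrac{1}{2}\bigr) = \tfrac{1}{2}.
\end{align*}

% Information theory: each comparison yields H(x) bits about the key's position.
\begin{align*}
-\log_2 g(x) &= H(x) = -x\log_2 x - (1-x)\log_2(1-x) \le 1, \\
H\bigl(\tfrac{1}{2}\bigr) &= 1 \text{ bit}, \qquad H\bigl(\tfrac{1}{4}\bigr) \approx 0.811 \text{ bits}.
\end{align*}
```

Read this way, a 1/4 split gains less information per comparison than a midpoint split, so on average it needs more comparisons to pin down the key.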
