O(log n) 算法在排序数组中找到最佳插入位置 [英] O(log n) algorithm to find best insert position in sorted array

查看:19
本文介绍了O(log n) 算法在排序数组中找到最佳插入位置的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试制定一个算法,找到将目标插入到已经排序的数组中的最佳位置.

I'm trying to make an algorithm that finds the best position to insert the target into the already sorted array.

目标是要么返回列表中存在的项目的位置,要么返回它进入以保持列表排序的位置.

The goal is to either return the position of the item if it exists in the list, else return the position it would go into to keep the list sorted.

所以说我有一个清单:

   0   1   2   3   4    5    6
 ---------------------------------
 | 1 | 2 | 4 | 9 | 10 | 39 | 100 |
 ---------------------------------

我的目标项目是 14它应该返回 5

And my target item is 14 It should return an index position of 5

我目前拥有的伪代码:

array = generateSomeArrayOfOrderedNumbers()

number findBestIndex(target, start, end)
    mid = abs(end - start) / 2

    if (mid < 2) 
        // Not really sure what to put here
        return start + 1 // ??

    if (target < array[mid])
        // The target belongs on the left side of our list //
        return findBestIndex(target, start, mid - 1)
    else
        // The target belongs on the right side of our list //
        return findBestIndex(target, mid + 1, end)

我现在不确定该放什么.我试图对此采用二分搜索方法,但这是我在 5 次左右重写后能想到的最好方法.

I not really sure what to put at this point. I tried to take a binary search approach to this, but this is the best I could come up with after 5 rewrites or so.

推荐答案

你的代码有几个问题:

mid = abs(end - start) / 2

不是startend 之间的中间,它是它们之间距离的一半(向下取整为整数).稍后您使用它就像它确实是一个有效的索引:

This is not the middle between start and end, it's half the distance between them (rounded down to an integer). Later you use it like it was indeed a valid index:

findBestIndex(target, start, mid - 1)

事实并非如此.您可能打算在这里使用 mid = (start + end)//2 或其他东西.你也错过了一些索引,因为你跳过了中间:

Which it is not. You probably meant to use mid = (start + end) // 2 or something here. You also miss a few indices because you skip over the mid:

return findBestIndex(target, start, mid - 1)
 ...
return findBestIndex(target, mid + 1, end)

现在,您的基本情况也必须有所不同.一个好的候选人是条件

Your base case must now be expressed a bit differently as well. A good candidate is the condition

if start == end

因为现在您肯定知道您已经完成了搜索.注意还要考虑所有数组元素都小于target的情况,所以需要在最后插入.

Because now you definitely know you're finished searching. Note that you also should consider the case where all the array elements are smaller than target, so you need to insert it at the end.

如果您以前从未做过二分搜索,则很难正确地进行二分搜索.如果我进行二分搜索,我通常使用以下模式:

Binary search is something that is surprisingly hard to get right if you've never done it before. I usually use the following pattern if I do a binary search:

lo, hi = 0, n // [lo, hi] is the search range, but hi will never be inspected.
while lo < hi:
    mid = (lo + hi) // 2
    if check(mid): hi = mid
    else:          lo = mid + 1

check 是一个单调二进制谓词的条件下(它总是 false 直到某个点和 true 从那个点开始),在这个循环之后,lo == hi 将是 [0..n] 范围内的第一个数字,check(lo) == true.check(n) 被隐式假定为真(这是这种方法的神奇之处).

Under the condition that check is a monotone binary predicate (it is always false up to some point and true from that point on), after this loop, lo == hi will be the first number in the range [0..n] with check(lo) == true. check(n) is implicitely assumed to be true (that's part of the magic of this approach).

那么,对于包括目标位置和目标位置之后的所有索引来说,它是 true 并且对于之前的所有位置都是 false 的单调谓词是什么?

So what is a monotone predicate that is true for all indices including and after our target position and false for all positions before?

如果我们考虑一下,我们想找到数组中第一个大于目标的数字,所以我们只需将其插入即可:

If we think about it, we want to find the first number in the array that is larger than our target, so we just plug that in and we're good to go:

lo, hi = 0, n
while lo < hi:
    mid = (lo + hi) // 2
    if (a[mid] > target): hi = mid
    else:                 lo = mid + 1
return lo;

这篇关于O(log n) 算法在排序数组中找到最佳插入位置的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆