Python binary search-like function to find first number in sorted list greater than a specific value

Question

I'm trying to write a function in Python that finds the first number in a sorted list greater than a specific value that I pass in as an argument. I've found examples online that use simple list comprehensions to achieve this, but for my purposes I need to be performing this operation frequently and on large lists, so a search that runs in linear time is too expensive.

I've had a crack at writing an iterative binary search-like function to achieve this, though I'm coming across some edge cases where it doesn't work correctly. By the way, the function is not required to deal with a case where there is no larger item in the list. Here is my existing function:

def findFirstLarger(num, sortedList):
    low = 0; 
    high = len(sortedList) - 1

    mid = -1
    while True:
        print("low: " + str(low) + "\t high: " + str(high))
        if (low > high):
            print("Ah geez, low is " + str(low) + " and high is " + str(high))
            return # debugging, don't want this to happen
        if low == high:
            return sortedList[low]
        else:
            mid = (low + high) / 2;
            if num == sortedList[mid]:
                return sortedList[mid]
            elif num > sortedList[mid]:
                low = mid + 1
            else:
                high = mid - 1

One case I have noted where this function does not work is as follows:

>>> somenumbers=[n*2 for n in range(131072)]
>>> somenumbers[-5:]
[262134, 262136, 262138, 262140, 262142]


>>> binsearch.findFirstLarger(262139,somenumbers)
low: 0   high: 131071
low: 65536   high: 131071
low: 98304   high: 131071
low: 114688  high: 131071
low: 122880  high: 131071
low: 126976  high: 131071
low: 129024  high: 131071
low: 130048  high: 131071
low: 130560  high: 131071
low: 130816  high: 131071
low: 130944  high: 131071
low: 131008  high: 131071
low: 131040  high: 131071
low: 131056  high: 131071
low: 131064  high: 131071
low: 131068  high: 131071
low: 131070  high: 131071
low: 131070  high: 131069
Ah geez, low is 131070 and high is 131069

Here the correct result would be 262140, as this is the first number in the list greater than 262139.

Can anyone recommend a cleaner implementation of this that actually works? I didn't think this would be such an esoteric problem, though I haven't been able to find a solution anywhere as of yet.

Recommended answer

Have you tried the bisect module?

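from bisect import bisect_left

def find_ge(a, key):
    '''Find smallest item greater-than or equal to key.
    Raise ValueError if no such item exists.
    If multiple keys are equal, return the leftmost.

    '''
    # bisect_left returns the leftmost index where key could be inserted
    # while keeping a sorted, so a[i] is the first item >= key.
    i = bisect_left(a, key)
    if i == len(a):
        raise ValueError('No item found with key at or above: %r' % (key,))
    return a[i]

find_ge(somenumbers, 262139)  # returns 262140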
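
Note that find_ge returns the key itself when it is present in the list. If the first item strictly greater than the key is what is wanted, as the question's wording suggests, a minimal sketch using bisect_right could look like the following. The name find_gt and the error message are illustrative choices, not part of the answer above; the sample list is rebuilt from the question so the block runs on its own.

```python
from bisect import bisect_right

def find_gt(a, key):
    """Return the leftmost item in sorted list a that is strictly greater than key.

    Raises ValueError if no such item exists.
    """
    # bisect_right returns the insertion point after any items equal to key,
    # so a[i] (if it exists) is the first item > key.
    i = bisect_right(a, key)
    if i == len(a):
        raise ValueError('No item found with key above: %r' % (key,))
    return a[i]

# Sample data from the question: the even numbers 0, 2, ..., 262142.
somenumbers = [n * 2 for n in range(131072)]

print(find_gt(somenumbers, 262139))  # key absent from the list
print(find_gt(somenumbers, 262140))  # key present: the next larger item is returned
```

Both bisect_left and bisect_right run in O(log n), so either variant avoids the linear scan mentioned in the question.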


Your code is wrong because (1) low > high is a valid termination case, and (2) you should not stop at low == high without comparing that item to num; e.g. with your somenumbers it returns 0 when num == 1, because the search ends at index 0 and returns that item unchecked.
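
For readers who prefer to keep a hand-written loop, the two points above can be folded into one possible corrected version, sketched below. It is not the asker's or the answerer's code: the name find_first_larger, the half-open [low, high) range, and the ValueError for the "nothing larger" case are assumptions made for this illustration. Integer division (//) is used so the sketch also runs on Python 3.

```python
def find_first_larger(num, sorted_list):
    """Return the first item in sorted_list that is strictly greater than num.

    Raises ValueError if every item is <= num.
    """
    # Invariant: every item before index low is <= num,
    # and every item at index high or later is > num.
    low, high = 0, len(sorted_list)
    while low < high:
        mid = (low + high) // 2
        if sorted_list[mid] <= num:
            low = mid + 1   # mid is too small, discard it and everything left of it
        else:
            high = mid      # mid is a candidate, keep it inside the range
    # The loop ends with low == high, the index of the first item > num.
    if low == len(sorted_list):
        raise ValueError('No item greater than %r' % (num,))
    return sorted_list[low]

somenumbers = [n * 2 for n in range(131072)]
print(find_first_larger(262139, somenumbers))
print(find_first_larger(1, somenumbers))
```

Because high is set to mid rather than mid - 1, a candidate is never thrown away, and the loop only ends once the range is empty, so there is no separate low == high special case to get wrong.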
