Python binary search-like function to find first number in sorted list greater than a specific value
Question
I'm trying to write a function in Python that finds the first number in a sorted list that is greater than a specific value passed in as an argument. I've found examples online that use simple list comprehensions to achieve this, but I need to perform this operation frequently and on large lists, so a search that runs in linear time is too expensive.
I've had a crack at writing an iterative binary search-like function to achieve this, but I'm coming across some edge cases where it doesn't work correctly. Incidentally, the function is not required to handle the case where the list contains no larger item. Here is my existing function:
def findFirstLarger(num, sortedList):
    low = 0;
    high = len(sortedList) - 1
    mid = -1
    while True:
        print("low: " + str(low) + "\t high: " + str(high))
        if (low > high):
            print("Ah geez, low is " + str(low) + " and high is " + str(high))
            return # debugging, don't want this to happen
        if low == high:
            return sortedList[low]
        else:
            mid = (low + high) / 2;
            if num == sortedList[mid]:
                return sortedList[mid]
            elif num > sortedList[mid]:
                low = mid + 1
            else:
                high = mid - 1
One case I have noted where this function does not work is as follows:
>>> somenumbers=[n*2 for n in range(131072)]
>>> somenumbers[-5:]
[262134, 262136, 262138, 262140, 262142]
>>> binsearch.findFirstLarger(262139,somenumbers)
low: 0 high: 131071
low: 65536 high: 131071
low: 98304 high: 131071
low: 114688 high: 131071
low: 122880 high: 131071
low: 126976 high: 131071
low: 129024 high: 131071
low: 130048 high: 131071
low: 130560 high: 131071
low: 130816 high: 131071
low: 130944 high: 131071
low: 131008 high: 131071
low: 131040 high: 131071
low: 131056 high: 131071
low: 131064 high: 131071
low: 131068 high: 131071
low: 131070 high: 131071
low: 131070 high: 131069
Ah geez, low is 131070 and high is 131069
Here the correct result would be 262140, as this is the first number in the list greater than 262139.
Can anyone recommend a cleaner implementation of this that actually works? I didn't think this would be such an esoteric problem, but I haven't been able to find a solution anywhere yet.
Recommended answer
Have you tried the bisect module?
from bisect import bisect_left

def find_ge(a, key):
    '''Find smallest item greater-than or equal to key.
    Raise ValueError if no such item exists.
    If multiple keys are equal, return the leftmost.
    '''
    i = bisect_left(a, key)  # leftmost insertion point for key in sorted list a
    if i == len(a):
        raise ValueError('No item found with key at or above: %r' % (key,))
    return a[i]

find_ge(somenumbers, 262139)
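Note that find_ge returns the key itself when it is present in the list, whereas the question asks for the first number strictly greater than the value. A minimal sketch of a strictly-greater variant, assuming bisect_right is acceptable here; the name find_gt is chosen for illustration and is not part of the answer above:

```python
from bisect import bisect_right

def find_gt(a, key):
    """Return the leftmost item in sorted list a that is strictly greater than key.

    Raises ValueError if no such item exists.
    """
    # bisect_right returns the insertion point after any entries equal to key,
    # so a[i] is the first element that compares greater than key.
    i = bisect_right(a, key)
    if i == len(a):
        raise ValueError('No item found with key above: %r' % (key,))
    return a[i]

somenumbers = [n * 2 for n in range(131072)]
print(find_gt(somenumbers, 262139))  # 262140
print(find_gt(somenumbers, 262138))  # 262140, whereas find_ge would return 262138
```

Both functions run in logarithmic time; the only difference is how a key that is already in the list is treated.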
Your code is wrong in two ways: (1) low > high is a valid termination case, and (2) you should not stop at low == high; for example, it will return an incorrect index when num == 3 for your somenumbers.
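If a hand-written loop is preferred over bisect, the two points above suggest one possible restructuring. The sketch below is an assumption about how the fix could look, not the answerer's code: it uses a half-open range [low, high) instead of the closed range in the question, so the loop simply ends when low meets high, and it never returns early on an equal element. The name find_first_larger is illustrative.

```python
def find_first_larger(num, sorted_list):
    """Return the first item in sorted_list that is strictly greater than num."""
    low, high = 0, len(sorted_list)  # search the half-open range [low, high)
    while low < high:
        mid = (low + high) // 2  # floor division keeps mid an integer
        if sorted_list[mid] <= num:
            low = mid + 1   # mid and everything before it is too small
        else:
            high = mid      # mid is a candidate, so keep it in range
    if low == len(sorted_list):
        raise ValueError('No item greater than %r' % (num,))
    return sorted_list[low]

somenumbers = [n * 2 for n in range(131072)]
print(find_first_larger(262139, somenumbers))  # 262140
print(find_first_larger(3, somenumbers))       # 4
```

Keeping mid inside the range when sorted_list[mid] is larger than num is what avoids skipping past the answer, which is the failure shown in the trace in the question.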