Numpy equivalent of list.index
Question
In a low-level function that is called many times, I need to do the equivalent of Python's list.index, but with a NumPy array. The function needs to return as soon as it finds the first matching value, and raise ValueError otherwise. Something like:
>>> a = np.array([1, 2, 3])
>>> np_index(a, 1)
0
>>> np_index(a, 10)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: 10 not in array
I want to avoid a Python loop if possible. np.where isn't an option as it always iterates through the entire array; I need something that stops once the first index is found.
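One way to get early termination without a per-element Python loop is to scan the array in blocks and run the vectorized comparison on one block at a time. The following is only a sketch under assumptions: the name `np_index_chunked` and the default `chunk_size` are hypothetical, the array is taken to be 1-D, and the best block size would have to be tuned by measurement.

```python
import numpy as np

def np_index_chunked(arr, value, chunk_size=4096):
    """Return the index of the first occurrence of value in a 1-D array.

    Hypothetical helper: scans block by block so the search can stop early.
    Raises ValueError if the value is absent, mirroring list.index.
    """
    arr = np.asarray(arr)
    for start in range(0, arr.size, chunk_size):
        # Vectorized comparison restricted to the current block.
        hits = np.flatnonzero(arr[start:start + chunk_size] == value)
        if hits.size:
            # flatnonzero returns indices in ascending order.
            return start + int(hits[0])
    raise ValueError(f"{value} not in array")
```

The Python-level loop runs once per block rather than once per element, so its overhead stays small while blocks after the first hit are never examined.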
EDIT: Some more specific information related to the problem.
About 90% of the time, the index I'm searching for is in the first 1/4 to 1/2 of the array. So there's potentially a factor of 2-4 speedup at stake here. The other 10% of the time the value is not in the array at all.
I've profiled things already, and the call to np.where is the bottleneck, taking up at least 50% of the total runtime.
It is not essential that it raise a ValueError; it just has to return something that obviously indicates that the value isn't in the array.
I will probably code up a solution in Cython, as suggested.
Recommended answer
See my comment on the OP's question for caveats, but in general, I would do the following:
import numpy as np
a = np.array([1, 2, 3])
np.min(np.nonzero(a == 2)[0])
If the value you are looking for is not in the array, you'll get a ValueError:
ValueError: zero-size array to ufunc.reduce without identity
because you are trying to take the min value of an empty array.
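To get the list.index-style error message the question asks for, the expression above could be wrapped in a small function. This is a minimal sketch, assuming a 1-D array; `np_index` is the name used in the question, and the body here is only one possible implementation. It still scans the whole array.

```python
import numpy as np

def np_index(arr, value):
    """Full-scan lookup that mimics list.index (illustrative sketch)."""
    # flatnonzero returns matching indices in ascending order,
    # so the first entry is the smallest index.
    matches = np.flatnonzero(np.asarray(arr) == value)
    if matches.size == 0:
        raise ValueError(f"{value} not in array")
    return int(matches[0])
```

Since the question allows any unambiguous "not found" signal, the `raise` could be swapped for returning a sentinel such as -1 if exceptions are too costly in the hot path.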
I would profile this code and see if it is an actual bottleneck, because in general when NumPy searches through an entire array using a built-in function rather than an explicit Python loop, it is relatively fast. Insisting on halting the search at the first match may make no practical difference.
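A quick way to check is to time the full-scan expression with the standard-library `timeit` module. The array size and target position below are assumptions chosen purely for illustration; real measurements should use data shaped like the actual workload.

```python
import timeit
import numpy as np

# Assumed test data: one million integers, target near the middle.
a = np.arange(1_000_000)
target = 500_000

def full_scan():
    # Same expression as in the answer: scans the entire array.
    return np.min(np.nonzero(a == target)[0])

# Average seconds per call over 100 runs.
per_call = timeit.timeit(full_scan, number=100) / 100
print(f"full scan: {per_call * 1e6:.1f} microseconds per call")
```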