Numpy equivalent of list.index

Question

In a low-level function that is called many times, I need to do the equivalent of Python's list.index, but with a NumPy array. The function needs to return as soon as it finds the first matching value, and raise ValueError otherwise. Something like:

>>> a = np.array([1, 2, 3])
>>> np_index(a, 1)
0
>>> np_index(a, 10)
Traceback (most recent call last):    
  File "<stdin>", line 1, in <module>
ValueError: 10 not in array

I want to avoid a Python loop if possible. np.where isn't an option as it always iterates through the entire array; I need something that stops once the first index is found.
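One way to approximate that early exit without leaving NumPy is to scan the array in blocks and stop at the first block that contains a match. The following is only a sketch under stated assumptions: the array is 1-D, and the function name `np_index_chunked` and the `chunk` size are hypothetical, not part of NumPy. It still uses a Python loop, but over blocks rather than elements, so the per-element work stays vectorized.

```python
import numpy as np


def np_index_chunked(a, value, chunk=4096):
    """Return the index of the first occurrence of value in a 1-D array.

    Hypothetical helper: `chunk` is an illustrative block size to tune.
    Raises ValueError if the value is absent.
    """
    a = np.asarray(a)
    for start in range(0, a.size, chunk):
        # Compare one block at a time; an early match skips the remaining blocks.
        hits = np.flatnonzero(a[start:start + chunk] == value)
        if hits.size:
            return start + int(hits[0])
    raise ValueError(f"{value} not in array")
```

Whether this beats a single full-array comparison depends on the array size and on the chunk size, so it is worth timing on real data before adopting it.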

EDIT: Some more specific information related to the problem.

  • About 90% of the time, the index I'm searching for is in the first 1/4 to 1/2 of the array. So there's potentially a factor of 2-4 speedup at stake here. The other 10% of the time the value is not in the array at all.

I've profiled things already, and the call to np.where is the bottleneck, taking up at least 50% of the total runtime.

It is not essential that it raise a ValueError; it just has to return something that obviously indicates that the value isn't in the array.

I will probably code up a solution in Cython, as suggested.

Recommended answer

See my comment on the OP's question for caveats, but in general, I would do the following:

import numpy as np

a = np.array([1, 2, 3])
# Indices where a == 2, then the smallest of them (the first occurrence).
np.min(np.nonzero(a == 2)[0])

If the value you are looking for is not in the array, you'll get a ValueError:

ValueError: zero-size array to ufunc.reduce without identity

because you are trying to take the min value of an empty array.
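If the message from the empty reduction is too obscure, the same full-scan approach can be wrapped so that it raises the error the question asks for. This is a minimal sketch assuming a 1-D array; `np_index` is the hypothetical name used in the question, not a NumPy function. It uses `np.flatnonzero`, which returns matching indices in ascending order, so the first element is the first occurrence.

```python
import numpy as np


def np_index(a, value):
    """Full-scan equivalent of list.index for a 1-D array (hypothetical helper)."""
    # Indices of all matches, in ascending order.
    hits = np.flatnonzero(np.asarray(a) == value)
    if hits.size == 0:
        # Mirror the error shown in the question instead of the reduction error.
        raise ValueError(f"{value} not in array")
    return int(hits[0])
```

Note that this still compares every element, so it changes the error handling only, not the cost.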

I would profile this code and see if it is an actual bottleneck, because in general when numpy searches through an entire array using a built-in function rather than an explicit python loop, it is relatively fast. An insistence on halting the search when it finds the first value may be functionally irrelevant.
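A rough way to check how much an early exit could gain is to time the full scan against a scan that stops exactly at the match, which is the best case any early-exit search could reach. The array size, the target position (one quarter of the way in), and the function names below are illustrative assumptions, not measurements from the original problem.

```python
import timeit

import numpy as np

a = np.arange(1_000_000)
target = 250_000  # assumed: the match sits a quarter of the way into the array


def full_scan():
    # The answer's approach: compare every element, then take the first match.
    return np.min(np.nonzero(a == target)[0])


def scan_up_to_match():
    # Best case for an early exit: only the elements up to the match are compared.
    return np.min(np.nonzero(a[:target + 1] == target)[0])


t_full = timeit.timeit(full_scan, number=20)
t_part = timeit.timeit(scan_up_to_match, number=20)
print(f"full scan: {t_full:.4f}s, up to match: {t_part:.4f}s")
```

If the two timings are close for realistic array sizes, the fixed overhead per call dominates and stopping early is unlikely to matter.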
