Find the index of the k smallest values of a numpy array
Question
To find the index of the smallest value, I can use argmin:
import numpy as np
A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
print(A.argmin())  # 4 because A[4] = 0.1
I'm looking for something like:
print(A.argmin(numberofvalues=3))  # hypothetical keyword; argmin has no such parameter
# [4, 0, 7] because A[4] <= A[0] <= A[7] <= all other A[i]
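Note: in my use case, A has between 10,000 and 100,000 values, and I'm only interested in the indices of the k = 10 smallest values. k will never be larger than 10.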
Recommended answer
Use np.argpartition. It does not sort the entire array. It only guarantees that the element at index k is in its sorted position and that all smaller elements are moved before it. Thus the first k elements will be the k smallest elements, in no particular order.
import numpy as np
A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
k = 3
idx = np.argpartition(A, k)
print(idx)
# [4 0 7 3 1 2 6 5]
This returns the k smallest values. Note that these may not be in sorted order.
print(A[idx[:k]])
# [ 0.1 1. 1.5]
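If the indices are needed in ascending order of value, as in the [4, 0, 7] output the question asks for, one option is to sort only the k selected entries afterwards. The following is a minimal sketch, not part of the original answer. The helper name k_smallest_indices is hypothetical, and it partitions at k - 1 rather than k so that k equal to the array length is also accepted.

```python
import numpy as np

def k_smallest_indices(a, k):
    """Return the indices of the k smallest values of a 1-D array, ordered by value.

    Hypothetical helper for illustration; assumes 1 <= k <= len(a).
    """
    a = np.asarray(a)
    # Partition so that the k smallest values occupy the first k slots (unordered).
    part = np.argpartition(a, k - 1)[:k]
    # Sort only those k candidates by their values.
    return part[np.argsort(a[part])]

A = np.array([1, 7, 9, 2, 0.1, 17, 17, 1.5])
result = k_smallest_indices(A, 3)
print(result)  # [4 0 7]
```

Because only k entries are sorted after the partition, the extra cost stays small when k is much smaller than the array, which matches the k <= 10 use case in the question.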
To obtain the k largest values, use:
idx = np.argpartition(A, -k)
# [4 0 7 3 1 2 6 5]
A[idx[-k:]]
# [ 9. 17. 17.]
WARNING: Do not (re)use idx = np.argpartition(A, k); A[idx[-k:]] to obtain the k largest. That won't always work. For example, these are NOT the 3 largest values in x:
x = np.array([100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 0])
idx = np.argpartition(x, 3)
x[idx[-3:]]
# array([ 70, 80, 100])
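By contrast, partitioning at -3 places the three largest values in the last three slots. This is a small sketch on the same x. The order within the slice is not guaranteed, so the values are sorted here before printing.

```python
import numpy as np

x = np.array([100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 0])
k = 3
# Partition around the k-th position from the end, then take the last k slots.
idx = np.argpartition(x, -k)
top_k = np.sort(x[idx[-k:]])
print(top_k)  # [ 80  90 100]
```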
Here is a comparison against np.argsort, which also works but sorts the entire array to get the result.
In [2]: x = np.random.randn(100000)
In [3]: %timeit idx0 = np.argsort(x)[:100]
100 loops, best of 3: 8.26 ms per loop
In [4]: %timeit idx1 = np.argpartition(x, 100)[:100]
1000 loops, best of 3: 721 µs per loop
In [5]: np.all(np.sort(np.argsort(x)[:100]) == np.sort(np.argpartition(x, 100)[:100]))
Out[5]: True
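Outside IPython, a comparable measurement can be sketched with the standard timeit module. The array size and k mirror the session above. The seed and the repeat count are arbitrary choices, and absolute timings will vary by machine and NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
x = rng.standard_normal(100_000)
k = 100
n_runs = 20  # arbitrary repeat count

t_sort = timeit.timeit(lambda: np.argsort(x)[:k], number=n_runs)
t_part = timeit.timeit(lambda: np.argpartition(x, k)[:k], number=n_runs)
print(f"argsort:      {t_sort / n_runs * 1e3:.3f} ms per call")
print(f"argpartition: {t_part / n_runs * 1e3:.3f} ms per call")

# Both approaches should select the same set of indices
# (continuous random data, so ties are not expected).
same = np.array_equal(np.sort(np.argsort(x)[:k]),
                      np.sort(np.argpartition(x, k)[:k]))
print(same)
```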