Will NumPy keep the order of assignment when the index array contains duplicates?
Question
I want to assign to an array through an index array, but the index array contains duplicated indexes.
For example:
import numpy as np
a = np.arange(5)
index = np.array([1,2,3,1,2,3,1,2,3])
b = np.arange(9)
a[index] = b
Two questions:
- For duplicated indexes, does the latest assignment always take effect? Is a[1] == 6 guaranteed in every case, e.g. for a very large array a? Could a[1] ever be 0 or 3? More specifically, I am using numpy compiled with MKL (provided by Anaconda), and some array operations run in parallel.
Related post: Handling of duplicate indices in NumPy assignments
- If the answer above is no, is there any way to make sure that the assignment always keeps the order?
Recommended answer
Here's one approach that guarantees that, for each group of identical indices, the value assigned is the one at the group's last position -
# Get sorting indices for index keeping the order with 'mergesort' option
sidx = index.argsort(kind='mergesort')
# Get sorted index array
sindex = index[sidx]
# Get the last indices from each group of identical indices in sorted version
idx = sidx[np.r_[np.flatnonzero(sindex[1:] != sindex[:-1]), index.size-1]]
# Use those last group indices to select indices off index and b to assign
a[index[idx]] = b[idx]
Sample run -
In [141]: a
Out[141]: array([0, 1, 2, 3, 4])
In [142]: index
Out[142]: array([1, 2, 3, 1, 2, 1, 2, 3, 4, 2])
In [143]: b
Out[143]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [144]: sidx = index.argsort(kind='mergesort')
...: sindex = index[sidx]
...: idx = sidx[np.r_[np.flatnonzero(sindex[1:] != sindex[:-1]), index.size-1]]
...: a[index[idx]] = b[idx]
...:
In [145]: a
Out[145]: array([0, 5, 9, 7, 8])
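The same "last occurrence wins" idea can also be sketched with np.unique on the reversed index array. This is an alternative to the mergesort approach above, not part of the original answer; the helper name assign_last_wins is hypothetical. It relies on np.unique with return_index=True reporting the first occurrence of each value, which on the reversed array corresponds to the last occurrence in the original order.

```python
import numpy as np

def assign_last_wins(a, index, b):
    """Assign b into a at index; for duplicated indices the last value wins.

    Hypothetical helper for illustration, not a NumPy API.
    """
    index = np.asarray(index)
    b = np.asarray(b)
    # First occurrence in the reversed array == last occurrence in the original.
    uniq, first_rev = np.unique(index[::-1], return_index=True)
    last = index.size - 1 - first_rev
    # uniq contains no duplicates, so the write order no longer matters.
    a[uniq] = b[last]
    return a

a = np.arange(5)
index = np.array([1, 2, 3, 1, 2, 1, 2, 3, 4, 2])
b = np.arange(10)
assign_last_wins(a, index, b)
print(a)  # [0 5 9 7 8]
```

Because the final assignment only touches each target position once, the result should not depend on how NumPy iterates over the index array internally.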