Functional assignment in numpy

Views: 104 | Published: 2020/5/18 21:15:09 | Tags: python, numpy


Question

Suppose I have two arrays

A = [ 6, 4, 5, 7, 9 ]
ind = [ 0, 0, 2, 1, 2 ]

and a function f.

I want to build a new array B whose length is the number of distinct values in ind, where B[i] is the result of applying f to the subarray of A selected by ind == i.

For this example, if I take f = sum, then

B = [10, 7, 14]

or, with f = max,

B = [6, 7, 9]

Is there a more efficient way to do this in numpy than a for loop?

Thanks.

Recommended answer

For the special case f = sum:

In [32]: np.bincount(ind,A)
Out[32]: array([ 10.,   7.,  14.])
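
As a small check of the call above: np.bincount treats its second argument as weights, so each A[k] is added to the bin ind[k]. The sketch below compares it with an explicit loop on the question's arrays. The minlength argument is an addition here, not part of the original answer; it is shown only as a way to fix the output length.

```python
import numpy as np

A = np.array([6, 4, 5, 7, 9])
ind = np.array([0, 0, 2, 1, 2])

# weights=A: bin i accumulates every A[k] with ind[k] == i (result is float64)
sums = np.bincount(ind, weights=A)

# Reference: explicit loop over the group labels
expected = np.array([A[ind == i].sum() for i in range(ind.max() + 1)])

# minlength pads the result with zeros for labels that never occur
padded = np.bincount(ind, weights=A, minlength=5)

print(sums)    # [10.  7. 14.]
print(padded)  # [10.  7. 14.  0.  0.]
```

This only covers sums; bincount has no equivalent for max or other reductions.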

For a general f, assuming:

f is a ufunc (or, as with fmax below, a reduction that can be applied along axis 0)
You have enough memory to make a 2D array of shape len(A) x (max(ind)+1)

you can make a 2D array B:

B=np.zeros((len(A),max(ind)+1))

and fill in locations in B with values from A, such that the first column of B only gets values from A where ind == 0, the second column only gets values where ind == 1, and so on:

B[np.arange(len(A)),ind]=A

You would end up with an array like

[[ 6.  0.  0.]
 [ 4.  0.  0.]
 [ 0.  0.  5.]
 [ 0.  7.  0.]
 [ 0.  0.  9.]]

You can then apply f along axis=0 to obtain the desired result. This relies on a third assumption:

The extra zeros in B do not affect the desired result.
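
To make this third assumption concrete, here is a minimal sketch with a hypothetical input that contains negative values. With f = max, the padding zeros win over the real data. Filling B with -np.inf instead of zeros avoids that; this adjustment is not in the original answer and is specific to max.

```python
import numpy as np

# Hypothetical data: every value in group 1 is negative
A = np.array([6, 4, -5, -7, 9])
ind = np.array([0, 0, 1, 1, 2])

rows = np.arange(len(A))

# Zero padding, as in the approach described above
B = np.zeros((len(A), ind.max() + 1))
B[rows, ind] = A
with_zeros = B.max(axis=0)

# Padding with the identity element of max instead
B_inf = np.full((len(A), ind.max() + 1), -np.inf)
B_inf[rows, ind] = A
with_neg_inf = B_inf.max(axis=0)

print(with_zeros)    # group 1 comes out as 0, which is not a value of A
print(with_neg_inf)  # group 1 comes out as -5, the true maximum
```

For sum the zeros are harmless, which is why the assumption holds there without any change.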

If you can stomach these assumptions, then:

import numpy as np

A = np.array([ 6, 4, 5, 7, 9 ])
ind = np.array([ 0, 0, 2, 1, 2 ])

N=100
M=10
A2 = np.array([np.random.randint(M) for i in range(N)])
ind2 = np.array([np.random.randint(M) for i in range(N)])

def use_extra_axis(A,ind,f):
    B=np.zeros((len(A),max(ind)+1))
    B[np.arange(len(A)),ind]=A
    return f(B)

def use_loop(A,ind,f):
    n=max(ind)+1
    B=np.empty(n)
    for i in range(n):
        B[i]=f(A[ind==i])
    return B

def fmax(arr):
    return np.max(arr,axis=0)

if __name__=='__main__':
    print(use_extra_axis(A,ind,fmax))
    print(use_loop(A,ind,fmax))

For certain values of M and N (e.g. M=10, N=100), using an extra axis may be faster than using a loop:

% python -mtimeit -s'import test,numpy' 'test.use_extra_axis(test.A2,test.ind2,test.fmax)'
10000 loops, best of 3: 162 usec per loop

% python -mtimeit -s'import test,numpy' 'test.use_loop(test.A2,test.ind2,test.fmax)'
1000 loops, best of 3: 222 usec per loop

However, as N grows larger (say M=10, N=10000), using a loop may be faster:

% python -mtimeit -s'import test,numpy' 'test.use_extra_axis(test.A2,test.ind2,test.fmax)'
100 loops, best of 3: 13.9 msec per loop
% python -mtimeit -s'import test,numpy' 'test.use_loop(test.A2,test.ind2,test.fmax)'
100 loops, best of 3: 4.4 msec per loop
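
The same comparison can be reproduced from inside Python with the timeit module. The sketch below is an assumption about how one might set it up, not the harness used for the numbers above. The helper name time_call and the repeat counts are made up for illustration, and the random labels are constructed so that every group is non-empty.

```python
import timeit
import numpy as np

def use_extra_axis(A, ind, f):
    B = np.zeros((len(A), ind.max() + 1))
    B[np.arange(len(A)), ind] = A   # row k, column ind[k] receives A[k]
    return f(B)

def use_loop(A, ind, f):
    n = ind.max() + 1
    B = np.empty(n)
    for i in range(n):
        B[i] = f(A[ind == i])
    return B

def fmax(arr):
    return np.max(arr, axis=0)

def time_call(func, A, ind, number=5, repeat=3):
    """Hypothetical helper: best observed time per call, in seconds."""
    timer = timeit.Timer(lambda: func(A, ind, fmax))
    return min(timer.repeat(repeat=repeat, number=number)) / number

rng = np.random.default_rng(0)
M = 10
timings = {}
for N in (100, 10000):
    A2 = rng.integers(M, size=N)
    # Include every label at least once so use_loop never sees an empty group
    ind2 = np.concatenate([np.arange(M), rng.integers(M, size=N - M)])
    assert np.array_equal(use_extra_axis(A2, ind2, fmax), use_loop(A2, ind2, fmax))
    timings[N] = (time_call(use_extra_axis, A2, ind2), time_call(use_loop, A2, ind2))

for N, (t_axis, t_loop) in timings.items():
    print(f"N={N}: extra axis {t_axis * 1e6:.0f} usec, loop {t_loop * 1e6:.0f} usec")
```

Absolute numbers will differ by machine and NumPy version, so only the relative ordering is worth comparing.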

Incorporating Thuis's excellent idea of using a sparse matrix:

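import numpy as np
import scipy.sparse

def use_sparse_extra_axis(A,ind,f):
    # Same layout as use_extra_axis, but built sparsely and then densified.
    B=scipy.sparse.coo_matrix((A, (np.arange(len(A)), ind))).toarray()
    return f(B)

def use_sparse(A,ind,f):
    # Row i of the LIL matrix stores the entries of A where ind == i.
    return [f(v) for v in scipy.sparse.coo_matrix((A, (ind, np.arange(len(A))))).tolil().data]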
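
To see why use_sparse works, it may help to look at what the LIL matrix holds for the question's arrays. This is a small illustration that assumes SciPy is installed; np.arange is used for the column coordinates only to be explicit.

```python
import numpy as np
import scipy.sparse

A = np.array([6, 4, 5, 7, 9])
ind = np.array([0, 0, 2, 1, 2])

# Entry (ind[k], k) holds A[k], so row i collects the values with ind == i
M = scipy.sparse.coo_matrix((A, (ind, np.arange(len(A))))).tolil()

# M.data is an array of per-row lists; convert to plain ints for display
groups = [[int(x) for x in row] for row in M.data]

print(groups)                    # [[6, 4], [7], [5, 9]]
print([sum(g) for g in groups])  # [10, 7, 14]
print([max(g) for g in groups])  # [6, 7, 9]
```

Because each group is passed to f as its own list, this variant does not depend on the padding zeros that the extra-axis versions introduce.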

Which implementation is best depends on the parameters N and M:

N=1000, M=100
·───────────────────────·────────────────────·
│ use_sparse_extra_axis │ 1.15 msec per loop │
│        use_extra_axis │ 2.79 msec per loop │
│              use_loop │ 3.47 msec per loop │
│            use_sparse │ 5.25 msec per loop │
·───────────────────────·────────────────────·

N=100000, M=10
·───────────────────────·────────────────────·
│ use_sparse_extra_axis │ 35.6 msec per loop │
│              use_loop │ 43.3 msec per loop │
│            use_sparse │ 91.5 msec per loop │
│        use_extra_axis │  150 msec per loop │
·───────────────────────·────────────────────·

N=100000, M=50
·───────────────────────·────────────────────·
│            use_sparse │ 94.1 msec per loop │
│              use_loop │  107 msec per loop │
│ use_sparse_extra_axis │  170 msec per loop │
│        use_extra_axis │  272 msec per loop │
·───────────────────────·────────────────────·

N=10000, M=50
·───────────────────────·────────────────────·
│              use_loop │ 10.9 msec per loop │
│            use_sparse │ 11.7 msec per loop │
│ use_sparse_extra_axis │ 15.1 msec per loop │
│        use_extra_axis │ 25.4 msec per loop │
·───────────────────────·────────────────────·
