向量化3D数组的NumPy协方差 [英] Vectorizing NumPy covariance for 3D array
问题描述
我有一个形状为(t,n1,n2)
的3D numpy数组:
I have a 3D numpy array of shape (t, n1, n2)
:
x = np.random.rand(10, 2, 4)
I需要计算另一个形状为(t,n1,n1)
的3D数组 y
:
I need to calculate another 3D array y
which is of shape (t, n1, n1)
such that:
y[0] = np.cov(x[0,:,:])
...等等,沿第一个轴的所有切片都如此。
...and so on for all slices along the first axis.
因此,循环实现将是:
y = np.zeros((10,2,2))
for i in np.arange(x.shape[0]):
y[i] = np.cov(x[i, :, :])
有没有什么方法可以对此向量化,这样我就可以一次性计算所有协方差矩阵?我试着做:
Is there any way to vectorize this so I can calculate all covariance matrices in one go? I tried doing:
x1 = x.swapaxes(1, 2)
y = np.dot(x, x1)
但这没有用。
推荐答案
numpy.cov源代码
,并尝试使用默认参数。事实证明, np.cov(x [i,:,:])
将很简单:
N = x.shape[2]
m = x[i,:,:]
m -= np.sum(m, axis=1, keepdims=True) / N
cov = np.dot(m, m.T) /(N - 1)
因此,任务是对这个循环进行矢量化处理,该循环将遍历 i
并处理来自 x
的所有数据一口气。同样,我们可以在第三步中使用 broadcasting
。对于最后一步,我们将沿第一轴上的所有切片执行总和减少
。可以使用 np.einsum
以矢量化的方式有效地实现这一点。因此,最终的实现方式是-
So, the task was to vectorize this loop that would iterate through i
and process all of the data from x
in one go. For the same, we could use broadcasting
at the third step. For the final step, we are performing sum-reduction
there along all slices in first axis. This could be efficiently implemented in a vectorized manner with np.einsum
. Thus, the final implementation came to this -
N = x.shape[2]
m1 = x - x.sum(2,keepdims=1)/N
y_out = np.einsum('ijk,ilk->ijl',m1,m1) /(N - 1)
运行时测试
In [155]: def original_app(x):
...: n = x.shape[0]
...: y = np.zeros((n,2,2))
...: for i in np.arange(x.shape[0]):
...: y[i]=np.cov(x[i,:,:])
...: return y
...:
...: def proposed_app(x):
...: N = x.shape[2]
...: m1 = x - x.sum(2,keepdims=1)/N
...: out = np.einsum('ijk,ilk->ijl',m1,m1) / (N - 1)
...: return out
...:
In [156]: # Setup inputs
...: n = 10000
...: x = np.random.rand(n,2,4)
...:
In [157]: np.allclose(original_app(x),proposed_app(x))
Out[157]: True # Results verified
In [158]: %timeit original_app(x)
1 loops, best of 3: 610 ms per loop
In [159]: %timeit proposed_app(x)
100 loops, best of 3: 6.32 ms per loop
那里的加速很大!
这篇关于向量化3D数组的NumPy协方差的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!