Vectorising an equation using numpy

Question

I am trying to implement the above formula in vectorised form. Here K=3 and N=150. X is a 150x4 numpy array, mu is a 3x4 numpy array, Gamma is a 150x3 numpy array, and Sigma is a Kx4x4 numpy array, so Sigma[k] is a 4x4 numpy array.

N_k = np.sum(Gamma, axis=0)
for k in range(K):  # Correct
    x_new = X - mu[k]  # Correct
    a = np.dot(x_new.T, x_new)  # Incorrect from here I feel
    for i in range(len(data)):
        sigma[k] = Gamma[i][k] * a
    sigma[k] = sigma[k] / N_k  # totally incorrect

How can I fix this?

Recommended answer

A sum of products? Sounds like a job for np.einsum:

import numpy as np
N = 150
K = 3
M = 4
x = np.random.random((N,M))
mu = np.random.random((K,M))
gamma = np.random.random((N,K))

xbar = x-mu[:,None,:] # shape (3, 150, 4)
sigma = np.einsum('nk,knm,kno->kmo', gamma, xbar, xbar)
sigma /= gamma.sum(axis=0)[:,None,None]
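
As a sanity check, the einsum result can be compared against a plain double loop. This is only a sketch: it assumes the formula in the question is the usual responsibility-weighted covariance, Sigma_k = (1/N_k) * sum_n gamma[n,k] * (x_n - mu_k)(x_n - mu_k)^T, which is what the einsum expression computes. The name sigma_loop and the fixed seed are illustrative and not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
N, K, M = 150, 3, 4
x = rng.random((N, M))
mu = rng.random((K, M))
gamma = rng.random((N, K))

# Vectorised version from the answer
xbar = x - mu[:, None, :]                      # shape (K, N, M)
sigma = np.einsum('nk,knm,kno->kmo', gamma, xbar, xbar)
sigma /= gamma.sum(axis=0)[:, None, None]

# Loop reference (assumed form of the formula, see above)
N_k = gamma.sum(axis=0)                        # shape (K,)
sigma_loop = np.zeros((K, M, M))
for k in range(K):
    d = x - mu[k]                              # shape (N, M)
    for n in range(N):
        sigma_loop[k] += gamma[n, k] * np.outer(d[n], d[n])
    sigma_loop[k] /= N_k[k]                    # divide by the scalar N_k[k], not the whole vector

print(np.allclose(sigma, sigma_loop))
```

Note that the loop accumulates with += and divides by N_k[k]; these are the two places where the attempt in the question goes wrong.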


Decoding 'nk,knm,kno->kmo':

This subscript specification has three components to the left of the arrow (->), followed by one component to the right.

The three components on the left correspond with the subscripts for gamma, xbar and xbar, the operands being passed to np.einsum.

gamma has subscript nk, just as in the formula you posted. xbar has shape (3, 150, 4). You can think of it as having subscript knm, where k and n have the same meaning as in the formula you posted, and m is a subscript representing the axis of length 4 which is not explicitly mentioned in your formula, but is apparently there given your description of the shape of the arrays.
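
The (3, 150, 4) shape of xbar comes from broadcasting: mu[:, None, :] has shape (K, 1, M), and subtracting it from x of shape (N, M) stretches both to (K, N, M). A small check, using the array sizes from the question and placeholder values (the name mu_expanded is illustrative):

```python
import numpy as np

N, K, M = 150, 3, 4
x = np.zeros((N, M))                                 # placeholder data
mu = np.arange(K * M, dtype=float).reshape(K, M)     # placeholder means

mu_expanded = mu[:, None, :]    # shape (K, 1, M)
xbar = x - mu_expanded          # broadcasts to (K, N, M)

print(mu_expanded.shape, xbar.shape)

# Each slice xbar[k] is the same array the loop version builds as x - mu[k]
same = all(np.array_equal(xbar[k], x - mu[k]) for k in range(K))
print(same)
```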

Now the third subscript component is kno. The o subscript plays the same role as the m subscript, but we don't want summation over m. In fact, we want the m and o subscripts to be iterated over independently, not in lock-step. Hence we give the last axis of the third operand a different letter.
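
The difference between reusing a letter and introducing a new one is easiest to see on a tiny array. In this sketch (the array a and the result names are made up for illustration), two different letters give a sum of outer products, while a repeated letter makes both axes advance together:

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])      # shape (n=2, m=2)

# Different letters: m and o vary independently -> sum over n of outer products, shape (2, 2)
independent = np.einsum('nm,no->mo', a, a)

# Same letter: both operands advance along m in lock-step -> shape (2,)
lockstep = np.einsum('nm,nm->m', a, a)

print(independent)   # equals a.T @ a
print(lockstep)      # equals (a * a).sum(axis=0)
```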

Notice that n appears in the subscripts on the left (nk, knm, kno), but does not appear on the right (in kmo). That tells np.einsum to sum over n.

The k appears in the subscripts on the left and on the right. This tells np.einsum that we wish to advance the k subscript in lock-step, but (since it appears on the right) we don't want to sum over k.
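
The rule that a letter missing from the output is summed, while a letter present in the output is kept, can be checked in isolation. A minimal sketch with a made-up array g standing in for gamma:

```python
import numpy as np

g = np.arange(6, dtype=float).reshape(3, 2)   # shape (n=3, k=2)

# n is absent from the output, so it is summed; k survives
summed_over_n = np.einsum('nk->k', g)

# Both letters appear in the output: nothing is summed, the axes are only reordered
kept = np.einsum('nk->kn', g)

print(summed_over_n)   # same as g.sum(axis=0)
print(kept.shape)
```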

Since kmo appears on the right, these subscripts remain in the result. This results in sigma having shape (K,M,M) (i.e. (3,4,4)).
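
For comparison, the same (K, M, M) result can be produced without einsum, using broadcasting and a batched matrix product. This is an alternative formulation rather than the method of the answer; the names weighted and sigma_mm are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, chosen only for reproducibility
N, K, M = 150, 3, 4
x = rng.random((N, M))
mu = rng.random((K, M))
gamma = rng.random((N, K))

xbar = x - mu[:, None, :]                          # shape (K, N, M)
sigma = np.einsum('nk,knm,kno->kmo', gamma, xbar, xbar)
sigma /= gamma.sum(axis=0)[:, None, None]

# Weight each row of xbar[k] by gamma[:, k], then multiply per k
weighted = gamma.T[:, :, None] * xbar              # (K, N, 1) * (K, N, M) -> (K, N, M)
sigma_mm = weighted.transpose(0, 2, 1) @ xbar      # (K, M, N) @ (K, N, M) -> (K, M, M)
sigma_mm /= gamma.sum(axis=0)[:, None, None]

print(sigma.shape)
print(np.allclose(sigma, sigma_mm))
```

The einsum form keeps the index bookkeeping in one string, whereas the matmul form makes the intermediate shapes explicit; which reads better is largely a matter of taste.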
