Simplifying looped numpy operations

Question

I am trying to learn to efficiently implement various neural nets in Python and am currently trying to implement this model.

However, I am having trouble using numpy operations to implement the summation.

I have been following this existing implementation and am trying to simplify it, but it's not entirely clear to me what all of the array operations being performed are achieving. My interpretation is that the C's are multiplied through each of the columns of R and summed. However, my einsum implementation np.einsum('ijk,km->ij', C, R) doesn't seem to produce the required result.

I would appreciate some pointers towards simplifying this implementation. My current attempts have been to use np.einsum but that hasn't gotten me anywhere so far.

Code to simplify (explained in image/first link):

batchsize = X.shape[0]
R = self.R
C = self.C
bw = self.bw

# Obtain word features
tmp = R.as_numpy_array()[:,X.flatten()].flatten(order='F')
tmp = tmp.reshape((batchsize, self.K * self.context))
words = np.zeros((batchsize, self.K, self.context))
for i in range(batchsize):
    words[i,:,:] = tmp[i,:].reshape((self.K, self.context), order='F')
words = gpu.garray(words)

# Compute the hidden layer (predicted next word representation)
acts = gpu.zeros((batchsize, self.K))
for i in range(self.context):
    acts = acts + gpu.dot(words[:,:,i], C[i,:,:])

Recommended answer

Create a small words array:

In [565]: words = np.zeros((2,3,4))
In [566]: tmp = np.arange(2*3*4).reshape((2,3*4))
In [567]: for i in range(2):
     ...:     words[i,:,:] = tmp[i,:].reshape((3,4),order='F')
     ...:     
In [568]: tmp
Out[568]: 
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]])
In [569]: words
Out[569]: 
array([[[  0.,   3.,   6.,   9.],
        [  1.,   4.,   7.,  10.],
        [  2.,   5.,   8.,  11.]],

       [[ 12.,  15.,  18.,  21.],
        [ 13.,  16.,  19.,  22.],
        [ 14.,  17.,  20.,  23.]]])

I'm pretty sure this can be done without the loop

In [577]: C = np.ones((4,3,3))
In [578]: acts = np.zeros((2,3))
In [579]: for i in range(4):
     ...:     acts += np.dot(words[:,:,i], C[i,:,:])
     ...:     
In [580]: acts
Out[580]: 
array([[  66.,   66.,   66.],
       [ 210.,  210.,  210.]])

This dot loop can be expressed in einsum as:

In [581]: np.einsum('ijk,kjm->im', words, C)
Out[581]: 
array([[  66.,   66.,   66.],
       [ 210.,  210.,  210.]])

This sums over j and k. In the loop version, the sum over j is done inside dot, and the sum over k is done by the loop. However, for very large arrays, and with GPU speedup, the loop version might be faster. If the problem space gets too big, einsum can be slow and even hit memory errors (though recent NumPy versions add an optimize option to einsum).
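As a rough check of the equivalence described above, the sketch below compares the dot loop with the einsum call, with einsum using optimize=True, and with np.tensordot contracting the same two axes. The shapes and the random C are assumptions chosen for illustration (a random C is used instead of all ones so that an index mix-up would show up); the names acts_loop, acts_einsum, acts_einsum_opt and acts_tensordot are hypothetical.

```python
import numpy as np

# Illustrative shapes (assumed): batch=2, K=3, context=4
batch, K, context = 2, 3, 4
words = np.arange(batch * K * context, dtype=float).reshape(batch, K, context)
C = np.random.default_rng(0).standard_normal((context, K, K))

# Loop version: dot sums over j (axis 1 of words), the loop sums over k
acts_loop = np.zeros((batch, K))
for k in range(context):
    acts_loop += np.dot(words[:, :, k], C[k, :, :])

# Single einsum call summing over both j and k
acts_einsum = np.einsum('ijk,kjm->im', words, C)

# Same contraction, letting einsum pick a contraction order
acts_einsum_opt = np.einsum('ijk,kjm->im', words, C, optimize=True)

# tensordot: contract words axes (j, k) with C axes (j, k) = (1, 0)
acts_tensordot = np.tensordot(words, C, axes=([1, 2], [1, 0]))

print(np.allclose(acts_loop, acts_einsum))
print(np.allclose(acts_loop, acts_einsum_opt))
print(np.allclose(acts_loop, acts_tensordot))
```

Which variant is fastest likely depends on the array sizes and the backend, so it is worth timing them on realistic shapes before choosing one.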

Building words without the loop:

In [585]: tmp.reshape(2,3,4, order='F')
Out[585]: 
array([[[ 0,  3,  6,  9],
        [ 1,  4,  7, 10],
        [ 2,  5,  8, 11]],

       [[12, 15, 18, 21],
        [13, 16, 19, 22],
        [14, 17, 20, 23]]])
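Putting the pieces together, the whole snippet from the question might be written without loops roughly as follows. This is a NumPy-only sketch under stated assumptions: R is a plain ndarray of shape (K, vocab) rather than a GPU array, X is an integer array of shape (batchsize, context), and C has shape (context, K, K). The function names are hypothetical. It gathers the word features with integer-array indexing instead of the Fortran-order reshape, and checks the result against a loop version modeled on the question's code.

```python
import numpy as np

def word_features_loop(R, X):
    """Loop version modeled on the question's code (plain NumPy arrays assumed)."""
    batchsize, context = X.shape
    K = R.shape[0]
    tmp = R[:, X.flatten()].flatten(order='F')
    tmp = tmp.reshape((batchsize, K * context))
    words = np.zeros((batchsize, K, context))
    for i in range(batchsize):
        words[i, :, :] = tmp[i, :].reshape((K, context), order='F')
    return words

def word_features_vectorized(R, X):
    """R[:, X] has shape (K, batchsize, context); move the batch axis first."""
    return R[:, X].transpose(1, 0, 2)

def hidden_acts(words, C):
    """Sum over the feature axis k and the context axis c in one call."""
    return np.einsum('bkc,ckm->bm', words, C)

# Small made-up example: K=3 features, vocab=5, batch=2, context=4
rng = np.random.default_rng(0)
R = rng.standard_normal((3, 5))
X = rng.integers(0, 5, size=(2, 4))
C = rng.standard_normal((4, 3, 3))

words_loop = word_features_loop(R, X)
words_vec = word_features_vectorized(R, X)
acts = hidden_acts(words_vec, C)

print(np.array_equal(words_loop, words_vec))
print(acts.shape)
```

Indexing with R[:, X] avoids the flatten and reshape bookkeeping entirely, which may be easier to read than the order='F' reshape, though both give the same array here.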
