[Theano] How to evaluate gradient based on shared variables
Problem description
I'm currently facing this issue: I can't manage to evaluate my gradient symbolic variables in a recurrent neural network coded with Theano. Here's the code:
W_x = theano.shared(init_W_x, name='W_x')
W_h = theano.shared(init_W_h, name='W_h')
W_y = theano.shared(init_W_y, name='W_y')
[self.y, self.h], _ = theano.scan(self.step,
                                  sequences=self.x,
                                  outputs_info=[None, self.h0])
error = ((self.y - self.t) ** 2).sum()
gW_x, gW_y, gW_h = T.grad(self.error, [W_x, W_h, W_y])
[...]
def step(self, x_t, h_tm1):
    h_t = T.nnet.sigmoid(T.dot(self.W_x, x_t) + T.dot(h_tm1, self.W_h))
    y_t = T.dot(self.W_y, h_t)
    return y_t, h_t
I kept just the things I thought were appropriate.
I would like to be able to compute, for instance, 'gW_x', but when I try to embed it in a Theano function it doesn't work, because its dependencies (W_x, W_h, W_y) are shared variables.
Thank you very much
I believe that in this instance, you need to pass the shared variables to the function self.step in the non_sequences argument of theano.scan.
Therefore you need to change the signature of self.step to take three more arguments, corresponding to the shared variables, and then add the argument non_sequences=[W_x, W_h, W_y] to theano.scan.
Also, I suspect you may have made a typo in the penultimate line - should it be error = ((self.y - t) ** 2).sum()?