Building recurrent neural network with feed forward network in pytorch


Question

I was going through this tutorial. I have a question about the following class code:

import torch
import torch.nn as nn
from torch.autograd import Variable


class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()

        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size

        # Both layers take the concatenation of the input and the hidden state.
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)   # next hidden state
        output = self.i2o(combined)   # prediction for this timestep
        output = self.softmax(output)
        return output, hidden

    def init_hidden(self):
        return Variable(torch.zeros(1, self.hidden_size))

This code is taken from here. It was mentioned that

Since the state of the network is held in the graph and not in the layers, you can simply create an nn.Linear and reuse it over and over again for the recurrence.

What I don't understand is: how can one just increase the input feature size of an nn.Linear and say it is an RNN? What am I missing here?

Answer

The network is recurrent, because you evaluate multiple timesteps in the example. The following code is also taken from the pytorch tutorial you linked to.

import torch
import torch.nn as nn

# `rnn` is assumed to be the recurrent module defined earlier in the tutorial,
# returning (hidden, output) for a (batch, hidden) input
loss_fn = nn.MSELoss()

batch_size = 10
TIMESTEPS = 5

# Create some fake data
batch = torch.randn(batch_size, 50)
hidden = torch.zeros(batch_size, 20)
target = torch.zeros(batch_size, 10)

loss = 0
for t in range(TIMESTEPS):
    # yes! you can reuse the same network several times,
    # sum up the losses, and call backward!
    hidden, output = rnn(batch, hidden)
    loss += loss_fn(output, target)
loss.backward()

So the network itself is not recurrent, but in this loop you use it as a recurrent network by feeding the hidden state of the previous forward step, together with the batch input, at every timestep.
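
For concreteness, here is a minimal sketch of that unrolling using the RNN class from the question. The sizes 57/128/18 and the sequence length are made-up placeholders, roughly matching the name-classification tutorial:

rnn = RNN(input_size=57, hidden_size=128, output_size=18)  # placeholder sizes
line_tensor = torch.zeros(7, 1, 57)  # fake one-hot sequence: (seq_len, 1, input_size)

hidden = rnn.init_hidden()
for i in range(line_tensor.size(0)):
    # the same i2h/i2o Linear layers are reused at every timestep;
    # only the hidden state carries information from step to step
    output, hidden = rnn(line_tensor[i], hidden)
# after the loop, `output` is the prediction for the whole sequence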

You could also use it non-recurrently, by backpropagating the loss at every step and ignoring the hidden state.
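
A sketch of that non-recurrent variant, reusing the fake data from the snippet above (again just an illustration, not code from the tutorial): the hidden state is reset at every step and backward() is called per step, so each graph spans a single evaluation:

for t in range(TIMESTEPS):
    hidden = torch.zeros(batch_size, 20)  # throw away the previous hidden state
    hidden, output = rnn(batch, hidden)
    loss = loss_fn(output, target)
    loss.backward()  # the graph only covers this single step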

Since the state of the network is held in the graph and not in the layers, you can simply create an nn.Linear and reuse it over and over again for the recurrence.

This means that the information needed to compute the gradient is not held in the model itself, so you can append multiple evaluations of the module to the graph and then backpropagate through the full graph. This is described in the earlier paragraphs of the tutorial.
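
A tiny self-contained illustration of that point (my own sketch, not from the tutorial): applying the same nn.Linear twice builds two nodes in one graph, and a single backward() accumulates gradient from both uses in the layer's parameters:

import torch
import torch.nn as nn

lin = nn.Linear(4, 4)
x = torch.randn(1, 4)

h1 = lin(x)       # first use of the module
h2 = lin(h1)      # second use of the *same* module, appended to the same graph
h2.sum().backward()

# lin.weight.grad now holds contributions from both applications,
# because the recurrence is recorded in the graph, not stored in the layer
print(lin.weight.grad.shape)  # torch.Size([4, 4])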
