Understanding a simple LSTM pytorch


Question

import torch,ipdb
import torch.autograd as autograd
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable

rnn = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)
input = Variable(torch.randn(5, 3, 10))
h0 = Variable(torch.randn(2, 3, 20))
c0 = Variable(torch.randn(2, 3, 20))
output, hn = rnn(input, (h0, c0))

This is the LSTM example from the docs. I don't understand the following things:

  1. What is the output size, and why is it not specified anywhere?
  2. Why does the input have 3 dimensions? What do the 5 and 3 represent?
  3. What are the 2 and 3 in h0 and c0, and what do they represent?

Edit:

import torch,ipdb
import torch.autograd as autograd
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable

num_layers=3
num_hyperparams=4
batch = 1
hidden_size = 20
rnn = nn.LSTM(input_size=num_hyperparams, hidden_size=hidden_size, num_layers=num_layers)

input = Variable(torch.randn(1, batch, num_hyperparams)) # (seq_len, batch, input_size)
h0 = Variable(torch.randn(num_layers, batch, hidden_size)) # (num_layers, batch, hidden_size)
c0 = Variable(torch.randn(num_layers, batch, hidden_size))
output, hn = rnn(input, (h0, c0))
affine1 = nn.Linear(hidden_size, num_hyperparams)

ipdb.set_trace()
print output.size()
print h0.size()

*** RuntimeError: matrices expected, got 3D, 2D tensors at

Solution

The output of the LSTM is the output of all the hidden nodes in the final layer.
hidden_size - the number of LSTM blocks (hidden units) per layer.
input_size - the number of input features per time step.
num_layers - the number of stacked LSTM layers.
In total there are hidden_size * num_layers LSTM blocks.

The input dimensions are (seq_len, batch, input_size).
seq_len - the number of time steps in each input stream.
batch - the size of each batch of input sequences.

The hidden and cell dimensions are: (num_layers, batch, hidden_size).

output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_t) from the last layer of the RNN, for each t.

So there will be hidden_size * num_directions outputs. You didn't initialise the RNN to be bidirectional, so num_directions is 1. So output_size = hidden_size.
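
A quick shape check with the numbers from the first snippet makes this concrete (not part of the original answer; note that the second return value is really the tuple (h_n, c_n)):

import torch
import torch.nn as nn
from torch.autograd import Variable   # kept for parity with the question; a no-op wrapper in PyTorch >= 0.4

rnn = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)

input = Variable(torch.randn(5, 3, 10))   # (seq_len=5, batch=3, input_size=10)
h0 = Variable(torch.randn(2, 3, 20))      # (num_layers=2, batch=3, hidden_size=20)
c0 = Variable(torch.randn(2, 3, 20))

output, (hn, cn) = rnn(input, (h0, c0))
print(output.size())   # torch.Size([5, 3, 20]) -> (seq_len, batch, hidden_size * num_directions)
print(hn.size())       # torch.Size([2, 3, 20]) -> (num_layers * num_directions, batch, hidden_size)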

Edit: You can change the number of outputs by using a linear layer:

out_rnn, hn = rnn(input, (h0, c0))           # out_rnn: (seq_len, batch, hidden_size)
lin = nn.Linear(hidden_size, output_size)    # output_size: however many features you want per time step

# nn.Linear expects a 2D input here, so flatten the time and batch dimensions,
# apply the layer, then restore the (seq_len, batch, output_size) layout.
flat = out_rnn.view(seq_len * batch, hidden_size)
output = lin(flat).view(seq_len, batch, output_size)
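
Applied to the second snippet in the question, the same reshape is what avoids the RuntimeError when feeding the 3D LSTM output into affine1. A self-contained sketch (newer PyTorch versions also accept higher-dimensional input to nn.Linear, applying it to the last dimension, so the reshape is only strictly needed on older versions):

import torch
import torch.nn as nn
from torch.autograd import Variable

num_layers, num_hyperparams, batch, hidden_size, seq_len = 3, 4, 1, 20, 1
rnn = nn.LSTM(input_size=num_hyperparams, hidden_size=hidden_size, num_layers=num_layers)
affine1 = nn.Linear(hidden_size, num_hyperparams)

input = Variable(torch.randn(seq_len, batch, num_hyperparams))
h0 = Variable(torch.randn(num_layers, batch, hidden_size))
c0 = Variable(torch.randn(num_layers, batch, hidden_size))
output, hn = rnn(input, (h0, c0))                       # output: (1, 1, 20)

flat = output.view(seq_len * batch, hidden_size)        # (1, 20): 2D, as older nn.Linear required
scores = affine1(flat)                                  # (1, 4)
scores = scores.view(seq_len, batch, num_hyperparams)   # back to (1, 1, 4)
print(scores.size())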

Note: for this answer I assumed that we're only talking about non-bidirectional LSTMs.
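
For completeness, a small sketch (not from the original answer) of what changes when bidirectional=True: num_directions becomes 2, so the hidden and cell states need num_layers * num_directions layers and the last dimension of output doubles:

import torch
import torch.nn as nn
from torch.autograd import Variable

birnn = nn.LSTM(input_size=10, hidden_size=20, num_layers=2, bidirectional=True)
h0 = Variable(torch.randn(2 * 2, 3, 20))   # (num_layers * num_directions, batch, hidden_size)
c0 = Variable(torch.randn(2 * 2, 3, 20))
out, _ = birnn(Variable(torch.randn(5, 3, 10)), (h0, c0))
print(out.size())                          # torch.Size([5, 3, 40]) -> hidden_size * num_directions = 40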

Source: PyTorch docs.
