Trying to understand Pytorch's implementation of LSTM

Problem description

I have a dataset containing 1000 examples where each example has 5 features (a, b, c, d, e). I want to feed 7 examples to an LSTM so it predicts the feature (a) of the 8th day.
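
For context, here is a minimal sketch of how such (7, 1, 5) windows could be built from a (1000, 5) dataset with a sliding window; the variable names and the random data are illustrative, not from the original question:

import torch

# Illustrative data: 1000 days x 5 features, assuming column order (a, b, c, d, e)
data = torch.randn(1000, 5)

seq_len = 7
windows, targets = [], []
for i in range(len(data) - seq_len):
    windows.append(data[i:i + seq_len])   # 7 consecutive days, all 5 features
    targets.append(data[i + seq_len, 0])  # feature (a) of the 8th day

X_all = torch.stack(windows)              # (993, 7, 5)
y_all = torch.stack(targets)              # (993,)

# One sample, reshaped to nn.LSTM's default layout (seq_len, batch, input_size)
X = X_all[0].unsqueeze(1)                 # (7, 1, 5)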

Reading Pytorch's documentation of nn.LSTM(), I came up with the following:

input_size = 5
hidden_size = 10
num_layers = 1
output_size = 1

lstm = nn.LSTM(input_size, hidden_size, num_layers)
fc = nn.Linear(hidden_size, output_size)

out, hidden = lstm(X)  # Where X's shape is ([7,1,5])
output = fc(out[-1])

output  # output's shape is ([7,1])

According to the documentation:

The input of the nn.LSTM is "input of shape (seq_len, batch, input_size)" with "input_size – The number of expected features in the input x",

And the output is: "output of shape (seq_len, batch, num_directions * hidden_size): tensor containing the output features (h_t) from the last layer of the LSTM, for each t."

In this case, I thought seq_len would be the sequence of 7 examples, batch is 1, and input_size is 5. So the LSTM would consume each example containing 5 features, re-feeding the hidden layer at every iteration.
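
To illustrate that interpretation, here is a small sketch that unrolls the same nn.LSTM one time step at a time while carrying the hidden state forward (it reuses lstm and X from the snippet above, with X assumed to have shape (7, 1, 5)):

hidden = None
for t in range(X.size(0)):
    step = X[t].unsqueeze(0)       # (1, 1, 5): one time step, batch of 1
    step_out, hidden = lstm(step, hidden)

print(step_out.shape)              # (1, 1, 10): matches out[-1:] from the full-sequence call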

What am I missing?

Answer

When I extend your code to a full example -- I also added some comments that may help -- I get the following:

import torch
import torch.nn as nn

input_size = 5
hidden_size = 10
num_layers = 1
output_size = 1

lstm = nn.LSTM(input_size, hidden_size, num_layers)
fc = nn.Linear(hidden_size, output_size)

X = [
    [[1,2,3,4,5]],
    [[1,2,3,4,5]],
    [[1,2,3,4,5]],
    [[1,2,3,4,5]],
    [[1,2,3,4,5]],
    [[1,2,3,4,5]],
    [[1,2,3,4,5]],
]

X = torch.tensor(X, dtype=torch.float32)

print(X.shape)         # (seq_len, batch_size, input_size) = (7, 1, 5)
out, hidden = lstm(X)  # Where X's shape is ([7,1,5])
print(out.shape)       # (seq_len, batch_size, hidden_size) = (7, 1, 10)
out = out[-1]          # Get output of last step
print(out.shape)       # (batch, hidden_size) = (1, 10)
out = fc(out)          # Push through linear layer
print(out.shape)       # (batch_size, output_size) = (1, 1)

This makes sense to me, given your batch_size = 1 and output_size = 1 (I assume you're doing regression). I don't know where your output.shape = (7, 1) comes from.
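
As an aside, if it is indeed regression, a training step on top of the snippet above might look roughly like this; the target tensor y, the loss, and the optimizer settings are assumptions, not part of the original answer:

import torch.optim as optim

criterion = nn.MSELoss()
optimizer = optim.Adam(list(lstm.parameters()) + list(fc.parameters()), lr=1e-3)

y = torch.tensor([[0.5]])     # assumed target: feature (a) of the 8th day, shape (1, 1)

optimizer.zero_grad()
out, hidden = lstm(X)         # (7, 1, 10)
pred = fc(out[-1])            # (1, 1)
loss = criterion(pred, y)
loss.backward()
optimizer.step()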

Are you sure that your X has the correct dimensions? Did you maybe create the nn.LSTM with batch_first=True? There are a lot of little things that can sneak in.
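
To show why batch_first matters, here is a small sketch of what happens if the same (7, 1, 5) tensor is fed to an LSTM created with batch_first=True; this is only a guess at what might have gone wrong, not something the question confirms:

lstm_bf = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)

out_bf, hidden_bf = lstm_bf(X)   # X is still (7, 1, 5)
print(out_bf.shape)              # (7, 1, 10), but now read as (batch=7, seq_len=1, hidden_size=10)
print(out_bf[-1].shape)          # (1, 10): the last *sample in the batch*, not the last time step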
