ValueError: Expected target size (128, 44), got torch.Size([128, 100]), LSTM Pytorch


Problem description


I want to build a model that predicts the next character based on the previous characters. I have split the text into integer sequences of length 100 (using a Dataset and DataLoader).


Dimensions of my input and target variables are:

inputs dimension: (batch_size, sequence_length). In my case (128, 100)
targets dimension: (batch_size, sequence_length). In my case (128, 100)


After forward pass I get dimension of my predictions: (batch_size, sequence_length, vocabulary_size) which is in my case (128,100,44)
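A minimal char-level model reproducing these shapes might look like the sketch below. The layer sizes and the `CharLSTM` name are assumptions for illustration, not the asker's actual architecture; only the input/output shapes match the question.

```python
import torch
import torch.nn as nn

# Hypothetical minimal character-level LSTM matching the shapes above.
class CharLSTM(nn.Module):
    def __init__(self, vocab_size=44, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x):                  # x: (batch, seq_len)
        out, _ = self.lstm(self.embed(x))  # (batch, seq_len, hidden_dim)
        return self.fc(out)                # (batch, seq_len, vocab_size)

model = CharLSTM()
inputs = torch.randint(44, (128, 100))
print(model(inputs).shape)  # torch.Size([128, 100, 44])
```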


But when I calculate the loss using the nn.CrossEntropyLoss() function:

batch_size = 128
sequence_length   = 100
number_of_classes = 44
# creates random tensor of your output shape
output = torch.rand(batch_size,sequence_length, number_of_classes)
# creates tensor with random targets
target = torch.randint(number_of_classes, (batch_size,sequence_length)).long()

# define loss function and calculate loss
criterion = nn.CrossEntropyLoss()
loss = criterion(output, target)
print(loss)

I get an error:

ValueError: Expected target size (128, 44), got torch.Size([128, 100])


Question is: how should I handle the calculation of the loss function for many-to-many LSTM prediction, especially the sequence dimension? According to nn.CrossEntropyLoss, the dimensions must be (N, C, d1, d2, ..., dK), where N is the batch_size and C the number of classes. But what is d? Is it related to sequence length?

Answer


As a general comment, let me just say that you have asked many different questions, which makes it difficult for someone to answer. I suggest asking just one question per StackOverflow post, even if that means making several posts. I will answer just the main question that I think you are asking: "why is my code crashing and how to fix it?" and hopefully that will clear up your other questions.


Per your code, the output of your model has dimensions (128, 100, 44) = (N, D, C). Here N is the minibatch size, C is the number of classes, and D is the size of the extra (sequence) dimension of your input. The cross-entropy loss you are using expects the output to have dimension (N, C, D) and the target to have dimension (N, D). To clear up the documentation that says (N, C, d1, d2, ..., dK): remember that your input can be an arbitrary tensor of any dimensionality. In your case the inputs have length 100, but nothing stops someone from making a model with, say, a 100x100 image as input. (In that case the loss would expect the output to have dimension (N, C, 100, 100).) But in your case your input is one-dimensional, so you have just a single D = 100 for the length of your input.
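The image case mentioned above can be checked directly. The shapes below are hypothetical (a made-up per-pixel classification setting), chosen only to show that nn.CrossEntropyLoss accepts extra dimensions as long as the class axis sits in position 1:

```python
import torch
import torch.nn as nn

# Hypothetical per-pixel classification over a 100x100 image:
# output is (N, C, d1, d2), target is (N, d1, d2).
batch_size, num_classes, height, width = 8, 44, 100, 100

output = torch.rand(batch_size, num_classes, height, width)
target = torch.randint(num_classes, (batch_size, height, width))

criterion = nn.CrossEntropyLoss()
loss = criterion(output, target)  # works: class axis is in position 1
print(loss)
```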


Now we see the error: outputs should be (N, C, D), but yours is (N, D, C). Your targets have the correct dimension of (N, D). You have two ways to fix the issue. The first is to change the structure of your network so that its output is (N, C, D); this may or may not be easy, or what you want, in the context of your model. The second option is to transpose your axes at loss-computation time using torch.transpose: https://pytorch.org/docs/stable/generated/torch.transpose.html

batch_size = 128
sequence_length   = 100
number_of_classes = 44
# creates random tensor of your output shape (N, D, C)
output = torch.rand(batch_size,sequence_length, number_of_classes)
# transposes dimensionality to (N, C, D)
transposed_output = torch.transpose(output, 1, 2)
# creates tensor with random targets
target = torch.randint(number_of_classes, (batch_size,sequence_length)).long()

# define loss function and calculate loss
criterion = nn.CrossEntropyLoss()
loss = criterion(transposed_output, target)
print(loss)
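A third, equivalent workaround (not from the original answer, but a common pattern for sequence models) is to flatten the batch and sequence axes so every timestep becomes one classification sample. With the default 'mean' reduction this gives the same loss as the transpose approach:

```python
import torch
import torch.nn as nn

batch_size, sequence_length, number_of_classes = 128, 100, 44

output = torch.rand(batch_size, sequence_length, number_of_classes)  # (N, D, C)
target = torch.randint(number_of_classes, (batch_size, sequence_length))

criterion = nn.CrossEntropyLoss()

# Flatten: output -> (N*D, C), target -> (N*D,)
flat_loss = criterion(output.reshape(-1, number_of_classes), target.reshape(-1))

# Same value as transposing the class axis into position 1
transposed_loss = criterion(output.transpose(1, 2), target)
print(torch.allclose(flat_loss, transposed_loss))  # True
```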

