Pytorch LSTM: Target Dimension in Calculating Cross Entropy Loss


Question

I've been trying to get an LSTM (an LSTM followed by a linear layer in a custom model) working in PyTorch, but was getting the following error when calculating the loss:

Assertion `cur_target >= 0 && cur_target < n_classes' failed.

I defined the loss function with:

criterion = nn.CrossEntropyLoss()

and then used it with:

loss += criterion(output, target)

I was giving the target with dimensions [sequence_length, number_of_classes], and the output has dimensions [sequence_length, 1, number_of_classes].

The examples I was following seemed to be doing the same thing, but it differed from the PyTorch docs on cross entropy loss.

The docs say the target should be of dimension (N), where each value is 0 ≤ targets[i] ≤ C−1 and C is the number of classes. I changed the target to that form, but now I'm getting an error saying (the sequence length is 75, and there are 55 classes):

Expected target size (75, 55), got torch.Size([75])

I've tried looking at solutions for both errors, but still can't get this working properly. I'm confused about the proper dimensions of the target, as well as the actual meaning behind the first error (different searches gave very different meanings for the error, and none of the fixes worked).

Thanks

Answer

You can use squeeze() on your output tensor; this returns a tensor with all dimensions of size 1 removed.
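
One caveat worth noting (my addition, not part of the original answer): squeeze() without arguments removes *every* dimension of size 1, so squeeze(1) is a safer choice if you only want to drop that middle dimension. A minimal sketch:

import torch

x = torch.rand(75, 1, 55)
print(x.squeeze().shape)   # torch.Size([75, 55])
print(x.squeeze(1).shape)  # torch.Size([75, 55])

# edge case: with a sequence length of 1, squeeze() collapses too much
y = torch.rand(1, 1, 55)
print(y.squeeze().shape)   # torch.Size([55])   - 1-D, wrong shape for CrossEntropyLoss
print(y.squeeze(1).shape)  # torch.Size([1, 55]) - still 2-D, as expected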

This short code uses the shapes you mentioned in your question:

import torch
import torch.nn as nn

sequence_length   = 75
number_of_classes = 55
# create a random tensor with your output shape
output = torch.rand(sequence_length, 1, number_of_classes)
# create a tensor with random target class indices
target = torch.randint(number_of_classes, (sequence_length,)).long()

# define loss function and calculate loss
criterion = nn.CrossEntropyLoss()
loss = criterion(output, target)
print(loss)

This results in the error you described:

ValueError: Expected target size (75, 55), got torch.Size([75])

So using squeeze() on your output tensor solves your problem by getting it into the correct shape.

Example with the correct shape:

import torch
import torch.nn as nn

sequence_length   = 75
number_of_classes = 55
# create a random tensor with your output shape
output = torch.rand(sequence_length, 1, number_of_classes)
# create a tensor with random target class indices
target = torch.randint(number_of_classes, (sequence_length,)).long()

# define loss function and calculate loss
criterion = nn.CrossEntropyLoss()

# apply squeeze() on the output tensor to change its shape from [75, 1, 55] to [75, 55]
loss = criterion(output.squeeze(), target)
print(loss)

Output:

tensor(4.0442)

Using squeeze() changes your tensor shape from [75, 1, 55] to [75, 55] so that the output and target shapes match!

You can also use other methods to reshape your tensor; it is just important that you end up with the shape [sequence_length, number_of_classes] instead of [sequence_length, 1, number_of_classes].
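
For illustration (a sketch I'm adding, assuming the variables from the example above), a few equivalent ways to get from [75, 1, 55] to [75, 55]:

# assuming output has shape [sequence_length, 1, number_of_classes]
loss = criterion(output.view(sequence_length, number_of_classes), target)     # view: no copy, needs a contiguous tensor
loss = criterion(output.reshape(sequence_length, number_of_classes), target)  # reshape: copies only if needed
loss = criterion(output[:, 0, :], target)                                     # indexing away the middle dimension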

Your targets should be a LongTensor, i.e. a tensor of type torch.long, containing the class indices. The shape here is [sequence_length].
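
If your targets currently have shape [sequence_length, number_of_classes] (e.g. one-hot encoded, as in your first attempt), a sketch of the conversion, assuming each row contains exactly one 1 (one_hot_target is a hypothetical name):

# hypothetical one_hot_target of shape [sequence_length, number_of_classes]
# argmax over the class dimension recovers the class indices CrossEntropyLoss expects
target = one_hot_target.argmax(dim=1).long()  # shape [sequence_length], dtype torch.int64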


Shapes of the above example when passed to the cross entropy function:


Outputs: torch.Size([75, 55])
Targets: torch.Size([75])

Here is a more general example of what outputs and targets should look like for CE. In this case we assume we have 5 different target classes; below are three examples, for sequences of length 1, 2, and 3:

import torch
import torch.nn as nn

# init CE Loss function
criterion = nn.CrossEntropyLoss()

# sequence of length 1
output = torch.rand(1, 5)
# in this case the first class is our target; the index of the first class is 0
target = torch.LongTensor([0])
loss = criterion(output, target)
print('Sequence of length 1:')
print('Output:', output, 'shape:', output.shape)
print('Target:', target, 'shape:', target.shape)
print('Loss:', loss)

# sequence of length 2
output = torch.rand(2, 5)
# targets here are the first class (index 0) for the first element and the second class (index 1) for the second element
target = torch.LongTensor([0, 1])
loss = criterion(output, target)
print('\nSequence of length 2:')
print('Output:', output, 'shape:', output.shape)
print('Target:', target, 'shape:', target.shape)
print('Loss:', loss)

# sequence of length 3
output = torch.rand(3, 5)
# targets here are the first class, the second class, and the second class again for the last element of the sequence
target = torch.LongTensor([0, 1, 1])
loss = criterion(output, target)
print('\nSequence of length 3:')
print('Output:', output, 'shape:', output.shape)
print('Target:', target, 'shape:', target.shape)
print('Loss:', loss)

Output:

Sequence of length 1:
Output: tensor([[ 0.1956,  0.0395,  0.6564,  0.4000,  0.2875]]) shape: torch.Size([1, 5])
Target: tensor([ 0]) shape: torch.Size([1])
Loss: tensor(1.7516)

Sequence of length 2:
Output: tensor([[ 0.9905,  0.2267,  0.7583,  0.4865,  0.3220],
        [ 0.8073,  0.1803,  0.5290,  0.3179,  0.2746]]) shape: torch.Size([2, 5])
Target: tensor([ 0,  1]) shape: torch.Size([2])
Loss: tensor(1.5469)

Sequence of length 3:
Output: tensor([[ 0.8497,  0.2728,  0.3329,  0.2278,  0.1459],
        [ 0.4899,  0.2487,  0.4730,  0.9970,  0.1350],
        [ 0.0869,  0.9306,  0.1526,  0.2206,  0.6328]]) shape: torch.Size([3, 5])
Target: tensor([ 0,  1,  1]) shape: torch.Size([3])
Loss: tensor(1.3918)

I hope this helps!
