Cross Entropy in PyTorch
Question
I'm a bit confused by the cross entropy loss in PyTorch.
Consider this example:
import torch
import torch.nn as nn
from torch.autograd import Variable

# one row of raw scores and the index of the correct class
output = Variable(torch.FloatTensor([0, 0, 0, 1])).view(1, -1)
target = Variable(torch.LongTensor([3]))

criterion = nn.CrossEntropyLoss()
loss = criterion(output, target)
print(loss)
I would expect the loss to be 0. But I get:
Variable containing:
0.7437
[torch.FloatTensor of size 1]
As far as I know, cross entropy can be calculated like this:

H(p, q) = -Σ_x p(x) log(q(x))
But shouldn't the result then be 1*log(1) = 0?
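For reference, here is a minimal sketch of that textbook definition in plain Python (the names p and q are just illustrative):

import math

# cross entropy by the mathematical definition, assuming q is already
# a probability distribution (which is NOT what PyTorch assumes)
p = [0, 0, 0, 1]  # one-hot target distribution for class 3
q = [0, 0, 0, 1]  # the network output, read as probabilities
h = sum(-pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)
print(h)  # 0.0 -- matches the expected 1*log(1) = 0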
I tried different inputs like one-hot encodings, but this doesn't work at all, so it seems the input shape of the loss function is okay.
I would be really grateful if someone could help me out and tell me where my mistake is.
Thanks in advance!
Answer
In your example you are treating the output [0, 0, 0, 1] as probabilities, as required by the mathematical definition of cross entropy. But PyTorch treats them as raw outputs (logits) that don't need to sum to 1, and first converts them into probabilities using the softmax function.
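To make this concrete, here is a small sketch (assuming a recent PyTorch where plain tensors replace Variable) showing that nn.CrossEntropyLoss is equivalent to log-softmax followed by nn.NLLLoss:

import torch
import torch.nn as nn
import torch.nn.functional as F

output = torch.tensor([[0., 0., 0., 1.]])  # raw scores (logits), one sample
target = torch.tensor([3])                 # index of the correct class

# CrossEntropyLoss applies log-softmax internally, then takes the NLL
ce = nn.CrossEntropyLoss()(output, target)
nll = nn.NLLLoss()(F.log_softmax(output, dim=1), target)
print(ce.item(), nll.item())  # both ~0.7437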
So H(p, q) becomes:
H(p, softmax(output))
Translating the output [0, 0, 0, 1] into probabilities:
softmax([0, 0, 0, 1]) = [0.1749, 0.1749, 0.1749, 0.4754]
where:
-log(0.4754) = 0.7437
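This can be checked directly in PyTorch (again a sketch using plain tensors rather than Variable):

import torch

logits = torch.tensor([0., 0., 0., 1.])
probs = torch.softmax(logits, dim=0)  # [0.1749, 0.1749, 0.1749, 0.4754]
loss = -torch.log(probs[3])           # pick the probability of class 3
print(loss)  # tensor(0.7437)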