Cross Entropy in PyTorch

Question

I'm a bit confused by the cross entropy loss in PyTorch.

Consider this example:

import torch
import torch.nn as nn
from torch.autograd import Variable  # old (pre-0.4) PyTorch API

# raw scores for 4 classes, shape (1, 4)
output = Variable(torch.FloatTensor([0, 0, 0, 1])).view(1, -1)
# index of the correct class
target = Variable(torch.LongTensor([3]))

criterion = nn.CrossEntropyLoss()
loss = criterion(output, target)
print(loss)

I would expect the loss to be 0. But I get:

Variable containing:
 0.7437
[torch.FloatTensor of size 1]

As far as I know, cross entropy can be calculated like this:

H(p, q) = -∑ₓ p(x) · log q(x)

But shouldn't the result then be 1 * log(1) = 0?

I tried different inputs like one-hot encodings, but this doesn't work at all, so it seems the input shape of the loss function is okay.

I would be really grateful if someone could help me out and tell me where my mistake is.

Thanks in advance!

Answer

In your example, you are treating the output [0, 0, 0, 1] as probabilities, as required by the mathematical definition of cross entropy. But PyTorch treats them as raw outputs (logits) that don't need to sum to 1; it first converts them into probabilities, using the softmax function.

So H(p, q) becomes:

H(p, softmax(output))

Translating the output [0, 0, 0, 1] into probabilities:

softmax([0, 0, 0, 1]) = [0.1749, 0.1749, 0.1749, 0.4754]
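
These numbers can be reproduced directly; a quick sanity-check sketch, using the current tensor API rather than Variable (the variable name logits is just for illustration):

import torch

logits = torch.tensor([0., 0., 0., 1.])
print(torch.softmax(logits, dim=0))   # tensor([0.1749, 0.1749, 0.1749, 0.4754])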

From which:

-log(0.4754) = 0.7437
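
That is exactly the 0.7437 reported above. As a further illustration, here is a minimal sketch showing that, on this example, nn.CrossEntropyLoss matches applying nn.LogSoftmax followed by nn.NLLLoss (again with plain tensors instead of Variable):

import torch
import torch.nn as nn

output = torch.tensor([[0., 0., 0., 1.]])   # shape (1, 4), as in the question
target = torch.tensor([3])

ce = nn.CrossEntropyLoss()(output, target)
manual = nn.NLLLoss()(nn.LogSoftmax(dim=1)(output), target)
print(ce.item(), manual.item())   # both ≈ 0.7437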
