Caffe sigmoid cross entropy loss


Question


I am using the sigmoid cross entropy loss function for a multilabel classification problem, as laid out by this tutorial. However, both in the tutorial's results and in mine, the output predictions are in the range (-Inf, Inf), while the range of a sigmoid is [0, 1]. Is the sigmoid only processed in the backprop? That is, shouldn't a forward pass squash the output?

Answer


In this example, the input to the "SigmoidCrossEntropyLoss" layer is the output of a fully-connected layer. Indeed, there are no constraints on the values output by an "InnerProduct" layer, so they can lie anywhere in (-inf, inf).
However, if you examine the "SigmoidCrossEntropyLoss" layer carefully, you'll notice that it includes a "Sigmoid" layer internally -- this is what ensures numerically stable gradient estimation.
Therefore, at test time, you should replace the "SigmoidCrossEntropyLoss" layer with a plain "Sigmoid" layer to output per-class predictions.
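To see why fusing the sigmoid into the loss helps, here is a small numerical sketch (plain Python, not Caffe's actual C++ implementation) of the standard stable form of sigmoid cross-entropy on a raw logit `x` and a 0/1 target `z`:

```python
import math

def sigmoid(x):
    """Plain logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_ce_loss(x, z):
    """Numerically stable sigmoid cross-entropy.

    Algebraically equal to -(z*log(sigmoid(x)) + (1-z)*log(1-sigmoid(x))),
    rewritten as max(x, 0) - x*z + log(1 + exp(-|x|)) so that exp() is
    only ever called on a non-positive argument and cannot overflow.
    """
    return max(x, 0.0) - x * z + math.log(1.0 + math.exp(-abs(x)))

# The naive form -log(sigmoid(x)) rounds sigmoid(1000) to 1.0 (or
# sigmoid(-1000) to 0.0, making log() blow up); the stable form is fine:
print(sigmoid_ce_loss(1000.0, 1.0))   # → 0.0 (no overflow)
print(sigmoid_ce_loss(2.0, 1.0))      # matches -log(sigmoid(2.0))
```

This is why the loss layer wants the raw "InnerProduct" outputs rather than already-squashed probabilities: it applies the sigmoid and the log together in this combined, safe form.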
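Concretely, the swap is a prototxt edit. A sketch of the two network definitions follows; the blob names (`fc_out`, `labels`) are placeholders, not the tutorial's actual names:

```protobuf
# train.prototxt -- training: loss layer with the sigmoid built in
layer {
  name: "loss"
  type: "SigmoidCrossEntropyLoss"
  bottom: "fc_out"   # raw logits from the InnerProduct layer
  bottom: "labels"   # 0/1 multilabel targets
  top: "loss"
}

# deploy.prototxt -- test time: plain sigmoid, outputs per-class
# probabilities in [0, 1]
layer {
  name: "prob"
  type: "Sigmoid"
  bottom: "fc_out"
  top: "prob"
}
```

With this change, the forward pass of the deploy net does squash the outputs, which resolves the (-Inf, Inf) predictions observed in the question.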

