Caffe sigmoid cross entropy loss
Question
I am using the sigmoid cross entropy loss function for a multilabel classification problem, as laid out in this tutorial. However, in both the tutorial's results and my own, the output predictions are in the range (-Inf, Inf), while the range of a sigmoid is [0, 1]. Is the sigmoid only applied in the backprop? That is, shouldn't a forward pass squash the output?
Answer
In this example the input to the "SigmoidCrossEntropyLoss" layer is the output of a fully-connected layer. Indeed, there are no constraints on the values output by an "InnerProduct" layer; they can be anywhere in the range [-inf, inf].
However, if you examine "SigmoidCrossEntropyLoss" carefully, you'll notice that it includes a "Sigmoid" layer internally -- to ensure a numerically stable gradient estimate.
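The benefit of fusing the sigmoid into the loss can be illustrated numerically: naively computing log(sigmoid(x)) blows up for logits of large magnitude, whereas the algebraically equivalent combined form max(x, 0) - x*z + log(1 + exp(-|x|)) stays finite. A minimal NumPy sketch of the math (not Caffe code):

```python
import numpy as np

def sigmoid_cross_entropy(x, z):
    # Numerically stable form of -[z*log(s(x)) + (1-z)*log(1-s(x))],
    # where s is the sigmoid -- the kind of fused computation a
    # sigmoid-cross-entropy loss layer performs internally.
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

def naive_loss(x, z):
    # Computing the sigmoid first, then the log, overflows for
    # extreme logits.
    s = 1.0 / (1.0 + np.exp(-x))
    return -(z * np.log(s) + (1 - z) * np.log(1 - s))

x = np.array([-1000.0, 0.0, 1000.0])   # extreme logits
z = np.array([1.0, 1.0, 0.0])          # binary targets

print(sigmoid_cross_entropy(x, z))  # finite: 1000, log(2) ~ 0.693, 1000
print(naive_loss(x, z))             # inf at the extremes (with warnings)
```

Because the stable form never exponentiates a positive number, both the loss and its gradient remain well-behaved for arbitrarily large scores.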
Therefore, at test time, you should replace the "SigmoidCrossEntropyLoss" layer with a simple "Sigmoid" layer to output per-class predictions.
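In the prototxt this amounts to swapping the loss layer for a plain sigmoid in the deploy network. A hypothetical sketch (the blob names "fc8" and "prob" are placeholder assumptions, not taken from the tutorial):

```
# train_val.prototxt (training): sigmoid fused into the loss
layer {
  name: "loss"
  type: "SigmoidCrossEntropyLoss"
  bottom: "fc8"      # raw scores from the InnerProduct layer
  bottom: "label"
  top: "loss"
}

# deploy.prototxt (testing): plain sigmoid producing [0, 1] predictions
layer {
  name: "prob"
  type: "Sigmoid"
  bottom: "fc8"
  top: "prob"
}
```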