Need help understanding the Caffe code for SigmoidCrossEntropyLossLayer for multi-label loss
Question
I need help in understanding the Caffe function, SigmoidCrossEntropyLossLayer, which is the cross-entropy error with logistic activation.
Basically, the cross-entropy error for a single example with N independent targets is denoted as:
- sum-over-N( t[i] * log(x[i]) + (1 - t[i]) * log(1 - x[i]) )
where t is the target, 0 or 1, and x is the output, indexed by i. x, of course, goes through a logistic activation.
An algebraic trick for quicker cross-entropy calculation reduces the computation to:
-t[i] * x[i] + log(1 + exp(x[i]))
and you can verify that from Section 3 here.
The question is, how is the above translated to the loss calculating code below:
loss -= input_data[i] * (target[i] - (input_data[i] >= 0)) -
log(1 + exp(input_data[i] - 2 * input_data[i] * (input_data[i] >= 0)));
Thanks.
The function is reproduced below for convenience.
template <typename Dtype>
void SigmoidCrossEntropyLossLayer<Dtype>::Forward_cpu(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  // The forward pass computes the sigmoid outputs.
  sigmoid_bottom_vec_[0] = bottom[0];
  sigmoid_layer_->Forward(sigmoid_bottom_vec_, sigmoid_top_vec_);
  // Compute the loss (negative log likelihood)
  // Stable version of loss computation from input data
  const Dtype* input_data = bottom[0]->cpu_data();
  const Dtype* target = bottom[1]->cpu_data();
  int valid_count = 0;
  Dtype loss = 0;
  for (int i = 0; i < bottom[0]->count(); ++i) {
    const int target_value = static_cast<int>(target[i]);
    if (has_ignore_label_ && target_value == ignore_label_) {
      continue;
    }
    loss -= input_data[i] * (target[i] - (input_data[i] >= 0)) -
        log(1 + exp(input_data[i] - 2 * input_data[i] * (input_data[i] >= 0)));
    ++valid_count;
  }
  normalizer_ = get_normalizer(normalization_, valid_count);
  top[0]->mutable_cpu_data()[0] = loss / normalizer_;
}
Answer
In the expression log(1 + exp(x[i])) you might encounter numerical instability in case x[i] is very large. To overcome this numerical instability, one scales the sigmoid function like this:
sig(x) = exp(x) / (1 + exp(x))
       = [exp(x) * exp(-x*(x>=0))] / [(1 + exp(x)) * exp(-x*(x>=0))]
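Spelling that substitution out (an added intermediate step; this algebra is mine, not from the original answer): for x >= 0 the scaling factor is exp(-x), giving

sig(x) = 1 / (exp(-x) + 1)

and correspondingly log(1 + exp(x)) = x + log(1 + exp(-x)), so the per-element loss -t*x + log(1 + exp(x)) becomes -x*(t - 1) + log(1 + exp(-x)). For x < 0 the factor is exp(0) = 1 and the loss stays -t*x + log(1 + exp(x)). The two cases combine into the single branch-free expression

-x*(t - (x>=0)) + log(1 + exp(x - 2*x*(x>=0)))

which is exactly what the loop accumulates via loss -= .... Note that x - 2*x*(x>=0) equals -|x|, so the argument of exp() is never positive and cannot overflow.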
Now, if you plug the new and stable expression for sig(x) into the loss you'll end up with the same expression as Caffe is using.
Enjoy!