Caffe classification labels in HDF5


Question

I am fine-tuning a network. In one case I want to use it for regression, which works. In another case, I want to use it for classification.

For both cases I have an HDF5 file with a label. For regression, this is just a 1-by-1 numpy array containing a float. I thought I could use the same label for classification after changing my EuclideanLoss layer to SoftmaxLoss. However, I then get a negative loss, like so:

    Iteration 19200, loss = -118232
    Train net output #0: loss = 39.3188 (* 1 = 39.3188 loss)
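For reference, a minimal sketch of writing such a regression-style HDF5 file with h5py; the file name, shapes, and dataset keys below are assumptions, chosen to match the setup described above:

    import h5py
    import numpy as np

    # Hypothetical shapes: N images with one float regression target each.
    N = 100
    images = np.random.rand(N, 3, 224, 224).astype(np.float32)
    targets = np.random.rand(N, 1).astype(np.float32)  # a 1-by-1 float per image

    with h5py.File('train.h5', 'w') as f:
        f['data'] = images    # keys must match the HDF5Data layer's top names
        f['label'] = targets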

Can you explain whether, and if so what, is going wrong? I do see that the training loss is about 40 (which is still terrible), but does the network still train? The negative loss just keeps getting more negative.

UPDATE
After reading Shai's comment and answer, I have made the following changes:
- I made the num_output of my last fully connected layer 6, as I have 6 labels (it used to be 1).
- I now create a one-hot vector and pass that as the label into my HDF5 dataset, as follows:

    f['label'] = numpy.array([1, 0, 0, 0, 0, 0])        
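    # NOTE: this array has shape (6,); Caffe's HDF5 layer treats axis 0 as the
    # sample axis, so it is read as 6 labels for a single image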

Trying to run my network now returns:

    Check failed: hdf_blobs_[i]->shape(0) == num (6 vs. 1)

After some research online, I reshaped the vector to a 1x6 vector. This led to the following error:

    Check failed: outer_num_ * inner_num_ == bottom[1]->count() (40 vs. 240)
    Number of labels must match number of predictions; e.g., if softmax axis == 1
    and prediction shape is (N, C, H, W), label count (number of labels)
    must be N*H*W, with integer values in {0, 1, ..., C-1}.

My idea is to add one label per data set (image); the batching happens in my train.prototxt. Shouldn't that create the correct batch size?

Answer

Since you moved from regression to classification, you need to output not a scalar to compare with "label" but rather a probability vector of length num-labels to compare with the discrete class "label". You need to change the num_output parameter of the layer before "SoftmaxWithLoss" from 1 to num-labels.
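For illustration, the tail of such a net definition might look like the sketch below after that change; the layer and blob names here are assumptions:

    layer {
      name: "fc_out"
      type: "InnerProduct"
      bottom: "fc7"            # previous layer's top; name is an assumption
      top: "fc_out"
      inner_product_param {
        num_output: 6          # was 1 for regression; one output per class
      }
    }
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "fc_out"
      bottom: "label"          # the scalar class index from the HDF5 file
      top: "loss"
    }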

I believe you are currently accessing uninitialized memory, and I would expect caffe to crash sooner or later in this case.

Update:
You made two changes: num_output 1-->6, and you also changed your input label from a scalar to a vector.
The first change was the only one you needed for using "SoftmaxWithLossLayer".
Do not change label from a scalar to a "hot vector".
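Concretely, a minimal sketch of writing the classification labels this way, again assuming h5py; shapes and names are assumptions. Each image gets one scalar class index:

    import h5py
    import numpy as np

    # Hypothetical: N images and 6 classes; one integer class index per image.
    N = 100
    images = np.random.rand(N, 3, 224, 224).astype(np.float32)
    labels = np.random.randint(0, 6, size=(N, 1)).astype(np.float32)

    with h5py.File('train.h5', 'w') as f:
        f['data'] = images
        f['label'] = labels   # values in {0, ..., 5}; no one-hot encoding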

Why?
Because "SoftmaxWithLoss" basically looks at the 6-vector prediction you output, interprets the ground-truth label as an index, and looks at -log(p[label]): the closer p[label] is to 1 (i.e., you predicted a high probability for the expected class), the lower the loss. A prediction with p[label] close to zero (i.e., you incorrectly predicted a low probability for the expected class) makes the loss grow fast.

使用"hot-vector"作为地面真实输入label,可能会导致多类别分类(似乎不是您要在此处解决的任务).您可能会发现此SO线程与特定情况有关.

Using a "hot-vector" as ground-truth input label, may give rise to multi-category classification (does not seems like the task you are trying to solve here). You may find this SO thread relevant to that particular case.
