Caffe pixel-wise classification / regression
Question
I want to do a simple pixel-wise classification or regression task. I have an input image and a ground_truth image. The task is an easy segmentation problem with a circle and a rectangle, and I want the network to learn where the circle is and where the rectangle is. That means my ground_truth image has value "1" at all the locations where the circle is and value "2" at all the locations where the rectangle is. Both the input images and the ground_truth images are supplied as .png files.
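To make the label layout concrete, here is a minimal sketch of how such an image and ground_truth pair could be generated with NumPy. The image size, the circle centre and radius, and the rectangle corners are all made-up values for illustration, and `make_sample` is a hypothetical helper, not something from the question.

```python
import numpy as np


def make_sample(height=64, width=64):
    """Return (image, label) for one synthetic circle-and-rectangle sample."""
    image = np.zeros((height, width), dtype=np.uint8)
    label = np.zeros((height, width), dtype=np.uint8)  # 0 = background

    # Circle mask: hypothetical centre (16, 16) and radius 10.
    rows, cols = np.ogrid[:height, :width]
    circle = (rows - 16) ** 2 + (cols - 16) ** 2 <= 10 ** 2

    # Rectangle mask: hypothetical extent, rows 35..54 and columns 30..59.
    rectangle = np.zeros((height, width), dtype=bool)
    rectangle[35:55, 30:60] = True

    image[circle | rectangle] = 255  # both shapes drawn in white
    label[circle] = 1                # 1 = circle
    label[rectangle] = 2             # 2 = rectangle
    return image, label


image, label = make_sample()
print(np.unique(label))
```

The label array has the same height and width as the image and holds one integer class index per pixel, which is the layout described above.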
I think I can then do either a regression or a classification task, depending on my loss layer. I have been using the fully convolutional AlexNet from fcn alexnet.
Classification:
layer {
  name: "upscore"
  type: "Deconvolution"
  bottom: "score_fr"
  top: "upscore"
  param {
    lr_mult: 0
  }
  convolution_param {
    num_output: 3 ## <<---- 0 = background 1 = circle 2 = rectangle
    bias_term: false
    kernel_size: 63
    stride: 32
  }
}
layer {
  name: "score"
  type: "Crop"
  bottom: "upscore"
  bottom: "data"
  top: "score"
  crop_param {
    axis: 2
    offset: 18
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss" ## <<----
  bottom: "score"
  bottom: "ground_truth"
  top: "loss"
  loss_param {
    ignore_label: 0
  }
}
Regression:
layer {
  name: "upscore"
  type: "Deconvolution"
  bottom: "score_fr"
  top: "upscore"
  param {
    lr_mult: 0
  }
  convolution_param {
    num_output: 1 ## <<---- 1 x height x width
    bias_term: false
    kernel_size: 63
    stride: 32
  }
}
layer {
  name: "score"
  type: "Crop"
  bottom: "upscore"
  bottom: "data"
  top: "score"
  crop_param {
    axis: 2
    offset: 18
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss" ## <<----
  bottom: "score"
  bottom: "ground_truth"
  top: "loss"
}
However, this does not even come close to producing the results I want. I think there is something wrong with my understanding of pixel-wise classification / regression. Could you tell me where my mistake is?
Edit 1
For regression, retrieving the output looks like this:
import cv2
import numpy as np

# pred comes from earlier code that is not shown here
output_blob = pred['result'].data
predicated_image_array = np.array(output_blob).squeeze()
print(predicated_image_array.shape)

range_value = np.ptp(predicated_image_array)
min_value = predicated_image_array.min()

# shift so the minimum is 0, then rescale to the range 0..255
predicated_image_array -= min_value
if range_value != 0:
    predicated_image_array /= range_value
predicated_image_array *= 255

# convert to 8-bit before writing the image file
predicated_image_array = predicated_image_array.astype(np.uint8)
print(predicated_image_array.shape)
cv2.imwrite('predicted_output.jpg', predicated_image_array)
This is easy, since the output is 1 x height x width and the values are the actual output values. But how would one retrieve the output for classification with the softmax layer, where the output is 3 (num labels) x height x width? I do not know what the contents of that shape mean.
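One common reading of a 3 x height x width score blob is that channel k holds the score for class k at every pixel, so the predicted label at a pixel is the index of the largest value along the channel axis. The sketch below assumes that layout; `scores_to_label_map` is a hypothetical helper and the toy scores are invented for illustration.

```python
import numpy as np


def scores_to_label_map(scores):
    """Collapse a (num_classes, H, W) score blob into an (H, W) label map."""
    scores = np.asarray(scores)
    if scores.ndim == 4:
        # Assumed (N, C, H, W) layout: keep only the first item of the batch.
        scores = scores[0]
    # For every pixel, pick the class whose channel has the highest score.
    return scores.argmax(axis=0).astype(np.uint8)


# Toy 3 x 2 x 2 blob: one channel each for background, circle, rectangle.
scores = np.array([
    [[0.90, 0.10], [0.20, 0.30]],  # class 0 = background
    [[0.05, 0.80], [0.10, 0.30]],  # class 1 = circle
    [[0.05, 0.10], [0.70, 0.40]],  # class 2 = rectangle
])
label_map = scores_to_label_map(scores)
print(label_map)
```

The resulting label map can be scaled (for example multiplied by 127) before saving, so that the three classes are visible as distinct grey levels.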
Recommended answer
First of all, your problem is not regression, but classification!
If you want to teach the net to recognise circles and rectangles, you have to make a different data set of images and labels, for example: circle - 0 and rectangle - 1. You do this by making a text file that contains the image paths and the image labels, for example:
/path/circle1.png 0
/path/circle2.png 0
/path/rectangle1.png 1
/path/rectangle1.png 1
Here is a nice tutorial for a problem like yours. Good luck.
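A listing file in the format described in this answer (one image path and one integer label per line, separated by a space) could be written with a few lines of Python. This is only a sketch: `write_listing` is a hypothetical helper, the paths are the placeholder paths from the example, and the output location is a temporary file chosen for illustration.

```python
import os
import tempfile


def write_listing(samples, listing_path):
    """Write one '<image path> <label>' line per sample."""
    with open(listing_path, "w") as handle:
        for image_path, label in samples:
            handle.write("{} {}\n".format(image_path, label))


# Placeholder paths; label 0 = circle, label 1 = rectangle.
samples = [
    ("/path/circle1.png", 0),
    ("/path/circle2.png", 0),
    ("/path/rectangle1.png", 1),
]

listing_path = os.path.join(tempfile.mkdtemp(), "train.txt")
write_listing(samples, listing_path)

with open(listing_path) as handle:
    print(handle.read())
```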