Caffe HDF5按像素分类 [英] Caffe HDF5 pixel-wise classification
问题描述
我正在尝试使用caffe为图像实现像素级二进制分类。对于尺寸为3x256x256的每个图像,我都有一个256x256的标签数组,其中每个条目都标记为0或1。此外,当我使用以下代码读取HDF5文件时,
I am trying to implement a pixel-wise binary classification for images using caffe. For each image having dimension 3x256x256, I have a 256x256 label array in which each entry is marked as either 0 or 1. Also, when I read my HDF5 file using the below code,
dirname = "examples/hdf5_classification/data"
f = h5py.File(os.path.join(dirname, 'train.h5'), "r")
ks = f.keys()
data = np.array(f[ks[0]])
label = np.array(f[ks[1]])
print "Data dimension from HDF5", np.shape(data)
print "Label dimension from HDF5", np.shape(label)
我得到的数据和标签尺寸为
I get the data and label dimension as
Data dimension from HDF5 (402, 3, 256, 256)
Label dimension from HDF5 (402, 256, 256)
我正在尝试将此数据输入给定的hdf5分类网络,并且在训练时,我有以下输出(使用默认求解器,但处于GPU模式)。
I am trying to feed this data into the given hdf5 classification network and while training, I have the following output(using the default solver, but in GPU mode).
!cd /home/unni/MTPMain/caffe-master/ && ./build/tools/caffe train -solver examples/hdf5_classification/solver.prototxt
给予
I1119 01:29:02.222512 11910 caffe.cpp:184] Using GPUs 0
I1119 01:29:02.509752 11910 solver.cpp:47] Initializing solver from parameters:
train_net: "examples/hdf5_classification/train_val.prototxt"
test_net: "examples/hdf5_classification/train_val.prototxt"
test_iter: 250
test_interval: 1000
base_lr: 0.01
display: 1000
max_iter: 10000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 5000
snapshot: 10000
snapshot_prefix: "examples/hdf5_classification/data/train"
solver_mode: GPU
device_id: 0
I1119 01:29:02.519805 11910 solver.cpp:80] Creating training net from train_net file: examples/hdf5_classification/train_val.prototxt
I1119 01:29:02.520031 11910 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I1119 01:29:02.520053 11910 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I1119 01:29:02.520104 11910 net.cpp:49] Initializing net from parameters:
name: "LogisticRegressionNet"
state {
phase: TRAIN
}
layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
hdf5_data_param {
source: "examples/hdf5_classification/data/train.txt"
batch_size: 10
}
}
layer {
name: "fc1"
type: "InnerProduct"
bottom: "data"
top: "fc1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 2
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc1"
bottom: "label"
top: "loss"
}
I1119 01:29:02.520256 11910 layer_factory.hpp:76] Creating layer data
I1119 01:29:02.520277 11910 net.cpp:106] Creating Layer data
I1119 01:29:02.520290 11910 net.cpp:411] data -> data
I1119 01:29:02.520331 11910 net.cpp:411] data -> label
I1119 01:29:02.520352 11910 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: examples/hdf5_classification/data/train.txt
I1119 01:29:02.529341 11910 hdf5_data_layer.cpp:94] Number of HDF5 files: 1
I1119 01:29:02.542645 11910 hdf5.cpp:32] Datatype class: H5T_FLOAT
I1119 01:29:10.601307 11910 net.cpp:150] Setting up data
I1119 01:29:10.612926 11910 net.cpp:157] Top shape: 10 3 256 256 (1966080)
I1119 01:29:10.612963 11910 net.cpp:157] Top shape: 10 256 256 (655360)
I1119 01:29:10.612969 11910 net.cpp:165] Memory required for data: 10485760
I1119 01:29:10.612983 11910 layer_factory.hpp:76] Creating layer fc1
I1119 01:29:10.624948 11910 net.cpp:106] Creating Layer fc1
I1119 01:29:10.625015 11910 net.cpp:454] fc1 <- data
I1119 01:29:10.625039 11910 net.cpp:411] fc1 -> fc1
I1119 01:29:10.645814 11910 net.cpp:150] Setting up fc1
I1119 01:29:10.645864 11910 net.cpp:157] Top shape: 10 2 (20)
I1119 01:29:10.645875 11910 net.cpp:165] Memory required for data: 10485840
I1119 01:29:10.645912 11910 layer_factory.hpp:76] Creating layer loss
I1119 01:29:10.657094 11910 net.cpp:106] Creating Layer loss
I1119 01:29:10.657133 11910 net.cpp:454] loss <- fc1
I1119 01:29:10.657147 11910 net.cpp:454] loss <- label
I1119 01:29:10.657163 11910 net.cpp:411] loss -> loss
I1119 01:29:10.657189 11910 layer_factory.hpp:76] Creating layer loss
F1119 01:29:14.883095 11910 softmax_loss_layer.cpp:42] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (10 vs. 655360) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.
*** Check failure stack trace: ***
@ 0x7f0652e1adaa (unknown)
@ 0x7f0652e1ace4 (unknown)
@ 0x7f0652e1a6e6 (unknown)
@ 0x7f0652e1d687 (unknown)
@ 0x7f0653494219 caffe::SoftmaxWithLossLayer<>::Reshape()
@ 0x7f065353f50f caffe::Net<>::Init()
@ 0x7f0653541f05 caffe::Net<>::Net()
@ 0x7f06535776cf caffe::Solver<>::InitTrainNet()
@ 0x7f0653577beb caffe::Solver<>::Init()
@ 0x7f0653578007 caffe::Solver<>::Solver()
@ 0x7f06535278b3 caffe::Creator_SGDSolver<>()
@ 0x410831 caffe::SolverRegistry<>::CreateSolver()
@ 0x40a16b train()
@ 0x406908 main
@ 0x7f065232cec5 (unknown)
@ 0x406e28 (unknown)
@ (nil) (unknown)
Aborted
Bas通常,错误是
softmax_loss_layer.cpp:42] Check failed:
outer_num_ * inner_num_ == bottom[1]->count() (10 vs. 655360)
Number of labels must match number of predictions;
e.g., if softmax axis == 1 and prediction shape is (N, C, H, W),
label count (number of labels) must be N*H*W,
with integer values in {0, 1, ..., C-1}.
我不明白为什么期望的标签数量与我的批量大小相同。我应该如何解决这个问题?我的标记方法有问题吗?
I am not able to understand why the number of labels expected is just same as my batch size. How exactly should I tackle this problem ? Is this a problem with my labeling method ?
推荐答案
您的问题是 SoftmaxWithLoss
层试图比较每个输入图像2个元素的预测向量,到每个图像256×256大小的标签。
这没有道理。
Your problem is that "SoftmaxWithLoss"
layer tries to compare a prediction vector of 2 elements per input image to a label of size 256-by-256 per image.
This makes no sense.
错误的根本原因:我想您很累的是将二进制分类器应用于图像中的每个像素。为此,您将 fc1定义为具有 num_output = 2
的 InnerProduct
层。但是,caffe看到这种情况的方式是,您将单个二进制分类器应用于了 entire 图像。因此,caffe为您提供整个图像的单个二进制预测。
Root cause of the error: I guess what you tired to do is to have a binary classifier applied to each pixel in the image. To that end you defined "fc1" as an "InnerProduct"
layer with num_output=2
. However, the way caffe sees this is that you have a single binary classifier applied to the entire image. Thus caffe gives you a single binary prediction to the entire image.
如何解决::在进行像素级预测时,您不再需要使用 InnerProduct code>层,您就有一个完全卷积网。如果您用转换层替换 fc1(例如检查每个像素的5×5环境并根据该补丁做出决定的内核):
How to solve: when working on pixel-wise predictions you no longer need to use "InnerProduct"
layers and you have a "fully convolutional net". If you replace "fc1" with a conv layer (for instance a kernel that examine the 5-by-5 environment of each pixel and makes a decision according to this patch):
layer {
name: "bin_class"
type: "Convolution"
bottom: "data"
top: "bin_class"
convolution_param {
num_output: 2 # binary class output
kernel_size: 5 # 5-by-5 patch for prediciton
pad: 2 # make sure spatial output size equals size of label
}
}
现在应用 SoftmaxWithLoss
到底部:bin_class
和底部:标签
应该起作用。
Now applying "SoftmaxWithLoss"
to bottom: bin_class
and bottom: label
should work.
这篇关于Caffe HDF5按像素分类的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!