How to create CaffeDB training data for siamese networks out of an image directory
Question
I need some help creating a CaffeDB for a siamese CNN out of a plain directory with images and a label text file. A Python way to do it would be best.
The problem is not walking through the directory and making pairs of images. My problem is rather how to make a CaffeDB out of those pairs.
So far I have only used convert_imageset to create a CaffeDB out of an image directory.
Thanks for any help!
Recommended answer
Why don't you simply make two datasets using good old convert_imageset?
layer {
  name: "data_a"
  top: "data_a"
  top: "label_a"
  type: "Data"
  data_param { source: "/path/to/first/data_lmdb" }
  ...
}
layer {
  name: "data_b"
  top: "data_b"
  top: "label_b"
  type: "Data"
  data_param { source: "/path/to/second/data_lmdb" }
  ...
}
As for the loss, since every example has a class label, you need to convert label_a and label_b into a same_not_same_label. I suggest you do this "on-the-fly" using a Python layer. In the prototxt, add the call to the Python layer:
layer {
  name: "a_b_to_same_not_same_label"
  type: "Python"
  bottom: "label_a"
  bottom: "label_b"
  top: "same_not_same_label"
  python_param {
    # the module name (usually the filename), which must be in $PYTHONPATH
    module: "siamese"
    # the layer name, i.e. the class name inside the module
    layer: "SiameseLabels"
  }
  # labels need no gradient; Caffe expects one propagate_down entry per bottom
  propagate_down: false
  propagate_down: false
}
Create siamese.py (make sure it is in your $PYTHONPATH). In siamese.py you should have the layer class:
import sys, os
# make pycaffe importable; assumes the CAFFE_ROOT environment variable is set
sys.path.insert(0, os.environ['CAFFE_ROOT'] + '/python')
import caffe

class SiameseLabels(caffe.Layer):
    def setup(self, bottom, top):
        if len(bottom) != 2:
            raise Exception('must have exactly two inputs')
        if len(top) != 1:
            raise Exception('must have exactly one output')

    def reshape(self, bottom, top):
        # the output has the same shape as one label blob
        top[0].reshape(*bottom[0].shape)

    def forward(self, bottom, top):
        # 1.0 where the two class labels agree, 0.0 where they differ
        top[0].data[...] = (bottom[0].data == bottom[1].data).astype('f4')

    def backward(self, top, propagate_down, bottom):
        # no back prop
        pass
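To see what the forward pass produces without running Caffe, the same comparison can be tried on plain NumPy arrays. This is only a sketch: the helper name same_not_same and the sample label values are made up for illustration and are not part of the answer above.

```python
import numpy as np

def same_not_same(label_a, label_b):
    """Return 1.0 where the two class labels match and 0.0 otherwise."""
    # Same expression as in SiameseLabels.forward, applied to plain arrays.
    return (np.asarray(label_a) == np.asarray(label_b)).astype('f4')

# Hypothetical label batches, as the two Data layers might deliver them.
label_a = np.array([3, 1, 4, 1], dtype=np.float32)
label_b = np.array([3, 2, 4, 0], dtype=np.float32)

same = same_not_same(label_a, label_b)
print(same)
```

The result has the same shape as either label batch, which is why reshape only needs to copy the shape of bottom[0].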
Make sure you shuffle the examples in the two sets differently, so that you get non-trivial pairs. Moreover, if you construct the first and second datasets with different numbers of examples, you will see different pairs at each epoch ;)
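One possible way to prepare the two inputs for convert_imageset is sketched below. It assumes the label file has already been parsed into (image path, integer label) tuples; the function names, the seeds, and the drop_from_b parameter are hypothetical choices for this example. pairs_in_epoch additionally assumes that both Data layers read their databases sequentially and wrap around at the end.

```python
import random

def make_shuffled_lists(entries, seed_a=0, seed_b=1, drop_from_b=1):
    """Return two independently shuffled copies of (path, label) entries.

    drop_from_b removes that many entries from the second list, so the
    two datasets end up with different lengths.
    """
    list_a = list(entries)
    list_b = list(entries)
    random.Random(seed_a).shuffle(list_a)
    random.Random(seed_b).shuffle(list_b)
    if drop_from_b:
        list_b = list_b[:-drop_from_b]
    return list_a, list_b

def write_list_file(path, entries):
    """Write one 'image_path label' line per entry for convert_imageset."""
    with open(path, "w") as f:
        for image_path, label in entries:
            f.write("%s %d\n" % (image_path, label))

def pairs_in_epoch(len_a, len_b, epoch):
    """Index pairs (a, b) met during one pass over dataset A.

    Assumes both readers advance one example at a time and wrap around.
    """
    start = epoch * len_a
    return [((start + i) % len_a, (start + i) % len_b) for i in range(len_a)]

# Hypothetical entries standing in for the parsed label file.
entries = [("img%d.jpg" % i, i % 3) for i in range(6)]
list_a, list_b = make_shuffled_lists(entries)
```

Each list would then be written with write_list_file and converted separately with convert_imageset. With lengths 6 and 5, pairs_in_epoch(6, 5, 0) and pairs_in_epoch(6, 5, 1) differ, whereas equal lengths repeat the same pairs every epoch.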
Make sure you construct the network so that the duplicated layers share their weights; see this tutorial for more information.