How to use the Caffe convnet library to detect facial expressions?
Question
How can I use a Caffe convnet to detect facial expressions?
I have an image dataset, Cohn-Kanade, and I want to train a Caffe convnet with it. Caffe has a documentation site, but it does not explain how to train on my own data. It only covers pre-trained data.
Can someone teach me how to do it?
Recommended answer
Caffe supports multiple formats for the input data (HDF5/lmdb/leveldb). It's just a matter of picking the one you feel most comfortable with. Here are a couple of options:
- caffe/build/tools/convert_imageset:
convert_imageset is one of the command-line tools you get from building Caffe.
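Usage is as follows:
- Specify a list of image and label pairs in a text file, one pair per line.
- Specify the location of the images.
- Choose the backend database (format). The default is lmdb, which should be fine.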
You need to write a text file in which each line starts with the filename of an image, followed by a scalar label (e.g. 0, 1, 2, ...).
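As a rough illustration of the list file and the tool invocation, here is a minimal Python sketch. The helper names `write_list_file` and `convert_imageset_args`, the file names and the labels are all made up for this example. The `--gray`, `--shuffle` and `--backend` flags and the `ROOTFOLDER/ LISTFILE DB_NAME` argument order are what convert_imageset accepts in the Caffe versions I have seen, so check the tool's help output for your build.

```python
def write_list_file(pairs, list_path):
    """Write one 'relative/image/path label' line per (path, label) pair."""
    with open(list_path, "w") as f:
        for rel_path, label in pairs:
            # convert_imageset splits each line on the last space,
            # so the label must be a single integer at the end.
            f.write("{} {}\n".format(rel_path, int(label)))


def convert_imageset_args(tool, image_root, list_path, db_path):
    """Build the argument list for convert_imageset (nothing is run here)."""
    # The tool concatenates the root folder and each listed path,
    # so the root needs a trailing slash.
    root = image_root if image_root.endswith("/") else image_root + "/"
    return [tool, "--gray", "--shuffle", "--backend=lmdb",
            root, list_path, db_path]


# Hypothetical file names and labels, for illustration only.
example_pairs = [
    ("subject01/frame_0011.png", 3),
    ("subject02/frame_0014.png", 7),
]
```

To actually build the database you could pass the returned list to `subprocess.check_call`, with `tool` pointing at your own caffe/build/tools/convert_imageset binary.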
- Build your lmdb in Python using Caffe's Datum class:
This requires building Caffe's Python interface. Here you write some Python code that:
- iterates through a list of images
- loads each image into a numpy array
- constructs a Caffe Datum object
- assigns the image data to the Datum object
- sets the label: the Datum class has a member called label, and you can set it to the AU class from your CK dataset, if that is what you want your network to classify
- writes the Datum object to the db and moves on to the next image
A blog post by Gustav Larsson has a code snippet for converting images to an lmdb. In his example he constructs an lmdb of image and label pairs for image classification.
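The steps listed above could look roughly like the sketch below. This is not Larsson's exact code. It assumes the `lmdb` and Pillow packages are installed and that Caffe's Python interface is importable. The helper names (`make_key`, `load_gray`, `write_lmdb`) and the 1 GiB `map_size` are choices made for this example.

```python
def make_key(index):
    """Zero-padded ASCII key, so records sort in insertion order."""
    return "{:08d}".format(index).encode("ascii")


def load_gray(path):
    """Load an image file as a (1, H, W) uint8 numpy array."""
    import numpy as np
    from PIL import Image  # assumption: Pillow is used for image loading
    arr = np.asarray(Image.open(path).convert("L"), dtype=np.uint8)
    return arr[np.newaxis, :, :]  # add the channel axis Caffe expects


def write_lmdb(samples, db_path, map_size=1 << 30):
    """Write (image_path, int_label) pairs to an lmdb at db_path.

    map_size is the maximum database size in bytes; it must be
    larger than the total size of the data you plan to store.
    """
    import lmdb
    from caffe.proto import caffe_pb2

    env = lmdb.open(db_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for i, (path, label) in enumerate(samples):
            img = load_gray(path)
            datum = caffe_pb2.Datum()
            datum.channels, datum.height, datum.width = img.shape
            datum.data = img.tobytes()   # raw uint8 pixel bytes
            datum.label = int(label)     # e.g. the expression / AU class
            txn.put(make_key(i), datum.SerializeToString())
    env.close()
```

A call such as `write_lmdb([("img0.png", 0), ("img1.png", 3)], "ck_train_lmdb")` (paths and labels hypothetical) would then produce a database that a Data layer can read.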
Load the lmdb into your network:
This is done exactly as in the LeNet example. The following Data layer sits at the beginning of the network prototxt that describes the LeNet model.
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
The source field is where you point Caffe to the location of the lmdb you just created.
Something that matters more for performance, and is not critical to getting this to work, is specifying how to normalize the input features. This is done through the transform_param field. CK+ has fixed-size images, so there is no need for resizing. One thing you do need, though, is to normalize the grayscale values. You can do this through mean subtraction. A simple way of doing this is to set mean_value in transform_param to the mean grayscale intensity of your CK+ dataset. Note that scale only multiplies the pixel values and does not subtract a mean, so the mean does not belong in transform_param:scale.
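For the mean subtraction, a small Python sketch of computing the dataset mean and rendering a matching transform_param block might look like this. Caffe's transform_param has a mean_value field, which is subtracted from each pixel before scale is applied. The helper names and the idea of passing each image as flat 8-bit bytes (for example from Pillow's `Image.tobytes()` on a mode "L" image) are assumptions of this example.

```python
def mean_intensity(images):
    """Mean 8-bit gray value over an iterable of flat bytes objects."""
    total = 0
    count = 0
    for pixels in images:
        total += sum(pixels)   # iterating bytes yields ints in Python 3
        count += len(pixels)
    if count == 0:
        raise ValueError("no pixels given")
    return total / count


def transform_param_text(mean, scale=0.00390625):
    """Render a transform_param block; Caffe computes (pixel - mean) * scale."""
    return (
        "transform_param {\n"
        "  mean_value: " + str(mean) + "\n"
        "  scale: " + str(scale) + "\n"
        "}"
    )
```

You would compute the mean over the training images only and paste the rendered block into the Data layer in place of the existing transform_param.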