How to feed Caffe multi-label data in HDF5 format?


Question

I want to use Caffe with a vector label rather than an integer label. I have checked some answers, and HDF5 seems to be the better way. But then I got stuck on an error like:

accuracy_layer.cpp:34] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (50 vs. 200) Number of labels must match number of predictions; e.g., if label axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.

The HDF5 file is created as:

f = h5py.File('train.h5', 'w')
f.create_dataset('data', (1200, 128), dtype='f8')
f.create_dataset('label', (1200, 4), dtype='f4')

My network is generated by:

def net(hdf5, batch_size):
    n = caffe.NetSpec()
    n.data, n.label = L.HDF5Data(batch_size=batch_size, source=hdf5, ntop=2)
    n.ip1 = L.InnerProduct(n.data, num_output=50, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    n.ip2 = L.InnerProduct(n.relu1, num_output=50, weight_filler=dict(type='xavier'))
    n.relu2 = L.ReLU(n.ip2, in_place=True)
    n.ip3 = L.InnerProduct(n.relu1, num_output=4, weight_filler=dict(type='xavier'))
    n.accuracy = L.Accuracy(n.ip3, n.label)
    n.loss = L.SoftmaxWithLoss(n.ip3, n.label)
    return n.to_proto()

with open(PROJECT_HOME + 'auto_train.prototxt', 'w') as f:
    f.write(str(net('/home/romulus/code/project/train.h5list', 50)))

with open(PROJECT_HOME + 'auto_test.prototxt', 'w') as f:
    f.write(str(net('/home/romulus/code/project/test.h5list', 20)))

It seems I should increase the number of labels and store them as integers rather than arrays, but if I do this, Caffe complains that the number of data and labels is not equal, then exits.

So, what is the correct format to feed multi label data?

Also, I wonder why no one has simply written down how the HDF5 data format maps to Caffe blobs.

Recommended answer

Answer to this question's title:

The HDF5 file should have two datasets in its root, named "data" and "label", respectively. Each has shape (number of samples, dimension). I'm using only one-dimensional data, so I'm not sure about the order of channel, width, and height. Maybe it does not matter. The dtype should be float or double.
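As a quick check of the layout described above, the following sketch validates a mapping of dataset names to shapes. The function name `check_layout` is hypothetical, and the rule that both datasets must share the same first dimension is inferred from the "number of data and label is not equal" complaint in the question, not taken from Caffe's documentation.

```python
def check_layout(shapes):
    """Check a {dataset name: shape} mapping against the layout described above.

    Hypothetical helper: returns (num_samples, data_dim, label_dim).
    """
    for name in ("data", "label"):
        if name not in shapes:
            raise ValueError("missing dataset %r in file root" % name)
    data_shape = tuple(shapes["data"])
    label_shape = tuple(shapes["label"])
    if len(data_shape) != 2 or len(label_shape) != 2:
        raise ValueError("this sketch only covers 2-D (samples, dimension) datasets")
    if data_shape[0] != label_shape[0]:
        raise ValueError("data has %d rows but label has %d"
                         % (data_shape[0], label_shape[0]))
    return data_shape[0], data_shape[1], label_shape[1]


# Shapes used in this answer: 1200 samples, 128-dim data, 4-dim labels.
print(check_layout({"data": (1200, 128), "label": (1200, 4)}))
```

With h5py, such a mapping could be built from an open file as `{name: f[name].shape for name in f}`, assuming the root contains only datasets.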

Sample code creating the training set with h5py:


import h5py
import numpy as np

f = h5py.File('train.h5', 'w')
# 1200 samples, each a 128-dim vector
f.create_dataset('data', (1200, 128), dtype='f8')
# One 4-dim label vector per sample
f.create_dataset('label', (1200, 4), dtype='f4')

# Fill in data with a fixed pattern
# Keep values between 0 and 1, or SigmoidCrossEntropyLoss will not work
for i in range(1200):
    a = np.empty(128)
    if i % 4 == 0:
        for j in range(128):
            a[j] = j / 128.0
        l = [1,0,0,0]
    elif i % 4 == 1:
        for j in range(128):
            a[j] = (128 - j) / 128.0
        l = [1,0,1,0]
    elif i % 4 == 2:
        for j in range(128):
            a[j] = (j % 6) / 128.0
        l = [0,1,1,0]
    elif i % 4 == 3:
        for j in range(128):
            a[j] = (j % 4) * 4 / 128.0
        l = [1,0,1,1]
    f['data'][i] = a
    f['label'][i] = l

f.close()
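The main script further down also reads a test.h5 through test.h5list, but its creation is not shown. One way to reuse the same pattern for both files is to factor the per-sample logic into a function. This is only a sketch: `make_sample` is a hypothetical name, and building test.h5 from the same pattern is an assumption.

```python
DIM = 128  # length of each data vector, as in the sample code above


def make_sample(i):
    """Return (data vector, label vector) for sample index i.

    Mirrors the four fixed patterns of the loop above; values stay in [0, 1].
    """
    kind = i % 4
    if kind == 0:
        data = [j / 128.0 for j in range(DIM)]
        label = [1, 0, 0, 0]
    elif kind == 1:
        data = [(128 - j) / 128.0 for j in range(DIM)]
        label = [1, 0, 1, 0]
    elif kind == 2:
        data = [(j % 6) / 128.0 for j in range(DIM)]
        label = [0, 1, 1, 0]
    else:
        data = [(j % 4) * 4 / 128.0 for j in range(DIM)]
        label = [1, 0, 1, 1]
    return data, label


data, label = make_sample(5)
print(len(data), label)
```

Inside the h5py loop, each row could then be filled with `f['data'][i], f['label'][i] = make_sample(i)`.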

Also, the accuracy layer is not needed; simply removing it is fine. It expects one integer class label per sample, so for a batch of 50 it looks for 50 labels, while the 4-dim label blob holds 50 × 4 = 200 values, hence the "50 vs. 200" in the error. The next problem is the loss layer. SoftmaxWithLoss likewise expects a single integer class index per sample, so it can't be used for a multi-label problem. Thanks to Adian and Shai, I found that SigmoidCrossEntropyLoss works well in this case.
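To make the difference concrete, here is a minimal sketch of what a sigmoid cross-entropy loss computes for one sample: each of the four outputs is treated as an independent binary decision. This is plain Python written for illustration, not Caffe's implementation. The function name is hypothetical, and any normalization Caffe applies over the batch is left out.

```python
import math


def sigmoid_cross_entropy(logits, targets):
    """Sum of per-element binary cross-entropy terms for one sample.

    Uses the numerically stable form max(x, 0) - x*t + log(1 + exp(-|x|)),
    which equals -[t*log(s) + (1-t)*log(1-s)] with s = sigmoid(x).
    """
    total = 0.0
    for x, t in zip(logits, targets):
        total += max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
    return total


# Undecided outputs versus confident, correct outputs for the label [1, 0, 1, 0].
print(sigmoid_cross_entropy([0.0, 0.0, 0.0, 0.0], [1, 0, 1, 0]))
print(sigmoid_cross_entropy([8.0, -8.0, 8.0, -8.0], [1, 0, 1, 0]))
```

Because each element contributes its own term, any combination of 0s and 1s in the label vector is a valid target, which is what a multi-label problem needs.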

Below is the full code, covering data creation, network training, and getting test results:

main.py (modified from the Caffe LeNet example)


import os, sys

PROJECT_HOME = '.../project/'
CAFFE_HOME = '.../caffe/'
os.chdir(PROJECT_HOME)

sys.path.insert(0, CAFFE_HOME + 'caffe/python')
import caffe, h5py

from pylab import *
from caffe import layers as L

def net(hdf5, batch_size):
    n = caffe.NetSpec()
    n.data, n.label = L.HDF5Data(batch_size=batch_size, source=hdf5, ntop=2)
    n.ip1 = L.InnerProduct(n.data, num_output=50, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    n.ip2 = L.InnerProduct(n.relu1, num_output=50, weight_filler=dict(type='xavier'))
    n.relu2 = L.ReLU(n.ip2, in_place=True)
    n.ip3 = L.InnerProduct(n.relu2, num_output=4, weight_filler=dict(type='xavier'))
    n.loss = L.SigmoidCrossEntropyLoss(n.ip3, n.label)
    return n.to_proto()

with open(PROJECT_HOME + 'auto_train.prototxt', 'w') as f:
    f.write(str(net(PROJECT_HOME + 'train.h5list', 50)))
with open(PROJECT_HOME + 'auto_test.prototxt', 'w') as f:
    f.write(str(net(PROJECT_HOME + 'test.h5list', 20)))

caffe.set_device(0)
caffe.set_mode_gpu()
solver = caffe.SGDSolver(PROJECT_HOME + 'auto_solver.prototxt')

solver.net.forward()
solver.test_nets[0].forward()
solver.step(1)

niter = 200
test_interval = 10
train_loss = zeros(niter)
test_acc = zeros(int(np.ceil(niter * 1.0 / test_interval)))
print len(test_acc)
output = zeros((niter, 8, 4))

# The main solver loop
for it in range(niter):
    solver.step(1)  # SGD by Caffe
    train_loss[it] = solver.net.blobs['loss'].data
    solver.test_nets[0].forward(start='data')
    output[it] = solver.test_nets[0].blobs['ip3'].data[:8]

    if it % test_interval == 0:
        print 'Iteration', it, 'testing...'
        correct = 0
        data = solver.test_nets[0].blobs['ip3'].data
        label = solver.test_nets[0].blobs['label'].data
        for test_it in range(100):
            solver.test_nets[0].forward()
            # Positive values map to label 1, while negative values map to label 0
            for i in range(len(data)):
                for j in range(len(data[i])):
                    if data[i][j] > 0 and label[i][j] == 1:
                        correct += 1
                    elif data[i][j] <= 0 and label[i][j] == 0:
                        correct += 1
        test_acc[int(it / test_interval)] = correct * 1.0 / (len(data) * len(data[0]) * 100)

# Train and test done, outputting convergence graph
_, ax1 = subplots()
ax2 = ax1.twinx()
ax1.plot(arange(niter), train_loss)
ax2.plot(test_interval * arange(len(test_acc)), test_acc, 'r')
ax1.set_xlabel('iteration')
ax1.set_ylabel('train loss')
ax2.set_ylabel('test accuracy')
_.savefig('converge.png')

# Check the result of last batch
print solver.test_nets[0].blobs['ip3'].data
print solver.test_nets[0].blobs['label'].data
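The element-wise accuracy computed in the loop above can be isolated into a small function. Thresholding the raw ip3 output at 0 is equivalent to thresholding its sigmoid at 0.5. This is a sketch on plain nested lists rather than Caffe blobs; `multilabel_accuracy` is a hypothetical name and the sample values are made up for illustration.

```python
def multilabel_accuracy(outputs, labels):
    """Fraction of individual label entries predicted correctly.

    An output above 0 counts as predicting 1, anything else as predicting 0.
    """
    correct = 0
    total = 0
    for out_row, label_row in zip(outputs, labels):
        for out, label in zip(out_row, label_row):
            predicted = 1 if out > 0 else 0
            correct += int(predicted == int(label))
            total += 1
    return correct / float(total)


# Made-up raw outputs and labels for two samples (illustration only).
outputs = [[2.5, -1.0, 0.3, -4.0],
           [-0.2, 1.7, 3.1, -0.5]]
labels = [[1, 0, 1, 0],
          [1, 1, 1, 0]]
print(multilabel_accuracy(outputs, labels))
```

Note that this counts each of the four entries separately, so a sample with three of four entries right still contributes 0.75 rather than 0.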

The h5list files simply contain the paths of the h5 files, one per line:

train.h5list

/home/foo/bar/project/train.h5

test.h5list

/home/foo/bar/project/test.h5
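These list files can also be written from Python next to the data-creation step. The sketch below assumes nothing beyond the one-path-per-line format shown above. `write_h5list` is a hypothetical helper, and the demonstration writes into a temporary directory instead of the project directory.

```python
import os
import tempfile


def write_h5list(list_path, h5_paths):
    """Write one HDF5 file path per line, the format shown above."""
    with open(list_path, "w") as f:
        for path in h5_paths:
            f.write(path + "\n")


# Demonstration in a temporary directory; the .h5 file itself is not created here.
tmp_dir = tempfile.mkdtemp()
list_path = os.path.join(tmp_dir, "train.h5list")
write_h5list(list_path, [os.path.join(tmp_dir, "train.h5")])
with open(list_path) as f:
    print(f.read())
```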

And the solver:

auto_solver.prototxt

train_net: "auto_train.prototxt"
test_net: "auto_test.prototxt"
test_iter: 10
test_interval: 20
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100
max_iter: 10000
snapshot: 5000
snapshot_prefix: "sed"
solver_mode: GPU
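For lr_policy: "inv", Caffe documents the learning rate as base_lr * (1 + gamma * iter) ^ (-power). The sketch below evaluates that formula with the values from auto_solver.prototxt to show how the rate decays. `inv_lr` is a hypothetical helper written for illustration, not part of Caffe.

```python
def inv_lr(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under the "inv" policy: base_lr * (1 + gamma * iter) ** -power."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


# Defaults are the values from auto_solver.prototxt above.
for it in (0, 200, 5000, 10000):
    print(it, inv_lr(it))
```

Since main.py only runs a couple of hundred steps, the rate has barely moved from base_lr by the time it finishes.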

Convergence graph (saved as converge.png by main.py):

Last batch result:


[[ 35.91593933 -37.46276474 -6.2579031 -6.30313492]
[ 42.69248581 -43.00864792 13.19664764 -3.35134125]
[ -1.36403108 1.38531208 2.77786589 -0.34310576]
[ 2.91686511 -2.88944006 4.34043217 0.32656598]
...
[ 35.91593933 -37.46276474 -6.2579031 -6.30313492]
[ 42.69248581 -43.00864792 13.19664764 -3.35134125]
[ -1.36403108 1.38531208 2.77786589 -0.34310576]
[ 2.91686511 -2.88944006 4.34043217 0.32656598]]

[[ 1. 0. 0. 0.]
[ 1. 0. 1. 0.]
[ 0. 1. 1. 0.]
[ 1. 0. 1. 1.]
...
[ 1. 0. 0. 0.]
[ 1. 0. 1. 0.]
[ 0. 1. 1. 0.]
[ 1. 0. 1. 1.]]
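The first array holds raw ip3 outputs, not probabilities. A minimal sketch of turning one printed row into probabilities and 0/1 predictions follows. The `sigmoid` helper is written here for illustration, and the 0.5 cut-off matches the "positive means 1" rule used in main.py.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


# Fourth row of the ip3 output printed above, with its label row.
row = [2.91686511, -2.88944006, 4.34043217, 0.32656598]
label = [1, 0, 1, 1]

probs = [sigmoid(v) for v in row]
predicted = [1 if p > 0.5 else 0 for p in probs]
print(probs)
print(predicted == label)
```

The last entry is only slightly above 0, so its probability sits just over 0.5 even though it lands on the correct side.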

I think this code still has many things to improve. Any suggestion is appreciated.
