How to feed caffe multi label data in HDF5 format?


Problem description

I want to use caffe with a vector label rather than an integer one. I have checked some answers, and HDF5 seems to be the better way. But then I got stuck with an error like:

accuracy_layer.cpp:34] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (50 vs. 200) Number of labels must match number of predictions; e.g., if label axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.

with the HDF5 file created as:

f = h5py.File('train.h5', 'w')
f.create_dataset('data', (1200, 128), dtype='f8')
f.create_dataset('label', (1200, 4), dtype='f4')

My network is generated by:

def net(hdf5, batch_size):
    n = caffe.NetSpec()
    n.data, n.label = L.HDF5Data(batch_size=batch_size, source=hdf5, ntop=2)
    n.ip1 = L.InnerProduct(n.data, num_output=50, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    n.ip2 = L.InnerProduct(n.relu1, num_output=50, weight_filler=dict(type='xavier'))
    n.relu2 = L.ReLU(n.ip2, in_place=True)
    n.ip3 = L.InnerProduct(n.relu1, num_output=4, weight_filler=dict(type='xavier'))
    n.accuracy = L.Accuracy(n.ip3, n.label)
    n.loss = L.SoftmaxWithLoss(n.ip3, n.label)
    return n.to_proto()

with open(PROJECT_HOME + 'auto_train.prototxt', 'w') as f:
    f.write(str(net('/home/romulus/code/project/train.h5list', 50)))

with open(PROJECT_HOME + 'auto_test.prototxt', 'w') as f:
    f.write(str(net('/home/romulus/code/project/test.h5list', 20)))

It seems I should increase the number of labels and store them as integers rather than arrays, but if I do this, caffe complains that the number of data and labels is not equal, then exits.

So, what is the correct format to feed multi-label data?

Also, I wonder why no one has simply written down how the HDF5 data format maps to caffe blobs?

Solution

Answer to this question's title:

The HDF5 file should have two datasets in its root, named "data" and "label". Each has shape (number of samples, dimension). I'm using only one-dimensional data, so I'm not sure what the order of channel, width, and height is. Maybe it does not matter. The dtype should be float or double.
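
As a rough illustration of how these dataset shapes turn into blob shapes, the sketch below mimics the bookkeeping in plain Python. It assumes that the HDF5Data layer keeps every axis after the first and replaces the first axis with batch_size, and that an image-like dataset would follow Caffe's usual (N, C, H, W) blob order. `count`, `blob_shape`, and `accuracy_check` are hypothetical helpers written for this example, not caffe functions. The last helper reproduces the two numbers compared by the Accuracy layer check quoted in the question.

```python
from functools import reduce
from operator import mul


def count(shape):
    """Number of elements in a blob of the given shape."""
    return reduce(mul, shape, 1)


def blob_shape(dataset_shape, batch_size):
    """Shape of the top blob for one batch (assumed HDF5Data behaviour)."""
    return (batch_size,) + tuple(dataset_shape[1:])


def accuracy_check(pred_shape, label_shape, label_axis=1):
    """Return (expected label count, actual label count) for the Accuracy layer."""
    outer_num = count(pred_shape[:label_axis])
    inner_num = count(pred_shape[label_axis + 1:])
    return outer_num * inner_num, count(label_shape)


data_blob = blob_shape((1200, 128), batch_size=50)
label_blob = blob_shape((1200, 4), batch_size=50)
pred_blob = (50, 4)  # output of the last InnerProduct layer (num_output=4)
expected, actual = accuracy_check(pred_blob, label_blob)
print(data_blob, label_blob, expected, actual)
```

With a 4-dim label per sample, the layer expects 50 labels per batch but receives 200, which matches the "(50 vs. 200)" in the error message.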

Sample code that creates the training set with h5py:

import h5py
import numpy as np

f = h5py.File('train.h5', 'w')
# 1200 samples, each a 128-dim vector
f.create_dataset('data', (1200, 128), dtype='f8')
# One label per sample, each a 4-dim vector
f.create_dataset('label', (1200, 4), dtype='f4')

# Fill in something with a fixed pattern
# Scale values to between 0 and 1, or SigmoidCrossEntropyLoss will not work
for i in range(1200):
    a = np.empty(128)
    if i % 4 == 0:
        for j in range(128):
            a[j] = j / 128.0
        l = [1, 0, 0, 0]
    elif i % 4 == 1:
        for j in range(128):
            a[j] = (128 - j) / 128.0
        l = [1, 0, 1, 0]
    elif i % 4 == 2:
        for j in range(128):
            a[j] = (j % 6) / 128.0
        l = [0, 1, 1, 0]
    elif i % 4 == 3:
        for j in range(128):
            a[j] = (j % 4) * 4 / 128.0
        l = [1, 0, 1, 1]
    f['data'][i] = a
    f['label'][i] = l

f.close()

Also, the accuracy layer is not needed; simply removing it is fine. The next problem is the loss layer. Since SoftmaxWithLoss expects a single integer class index per sample as its label, it can't be used for a multi-label problem. Thanks to Adian and Shai, I found that SigmoidCrossEntropyLoss works well in this case.
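
To make the difference concrete, here is a minimal pure-Python sketch of a sigmoid cross-entropy loss, in which every output is treated as an independent yes/no decision. It is not caffe's implementation: `sigmoid_cross_entropy` is a hypothetical helper, the sample values are made up, and dividing by the number of samples is an assumption about how the layer normalizes.

```python
import math


def sigmoid_cross_entropy(logits, targets):
    """Sigmoid cross-entropy summed over all outputs, averaged over samples."""
    total = 0.0
    for x_row, t_row in zip(logits, targets):
        for x, t in zip(x_row, t_row):
            # Stable form of -[t*log(s) + (1-t)*log(1-s)] with s = sigmoid(x)
            total += max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
    return total / len(logits)  # assumed normalization: per sample


# Made-up raw network outputs and 0/1 label vectors for two samples
logits = [[2.0, -3.0, 0.5, -1.0],
          [-0.5, 4.0, 1.5, -2.0]]
targets = [[1, 0, 1, 0],
           [0, 1, 1, 0]]
print(sigmoid_cross_entropy(logits, targets))
```

Because each of the 4 outputs gets its own sigmoid, any number of them can be 1 at the same time, which is what a multi-label target needs.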

Below is the full code, from data creation and network training to getting the test results:

main.py (modified from the caffe LeNet example)

import os, sys

PROJECT_HOME = '.../project/'
CAFFE_HOME = '.../caffe/'
os.chdir(PROJECT_HOME)

sys.path.insert(0, CAFFE_HOME + 'caffe/python')
import caffe, h5py

from pylab import *
from caffe import layers as L

def net(hdf5, batch_size):
    n = caffe.NetSpec()
    n.data, n.label = L.HDF5Data(batch_size=batch_size, source=hdf5, ntop=2)
    n.ip1 = L.InnerProduct(n.data, num_output=50, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    n.ip2 = L.InnerProduct(n.relu1, num_output=50, weight_filler=dict(type='xavier'))
    n.relu2 = L.ReLU(n.ip2, in_place=True)
    n.ip3 = L.InnerProduct(n.relu2, num_output=4, weight_filler=dict(type='xavier'))
    n.loss = L.SigmoidCrossEntropyLoss(n.ip3, n.label)
    return n.to_proto()

with open(PROJECT_HOME + 'auto_train.prototxt', 'w') as f:
    f.write(str(net(PROJECT_HOME + 'train.h5list', 50)))
with open(PROJECT_HOME + 'auto_test.prototxt', 'w') as f:
    f.write(str(net(PROJECT_HOME + 'test.h5list', 20)))

caffe.set_device(0)
caffe.set_mode_gpu()
solver = caffe.SGDSolver(PROJECT_HOME + 'auto_solver.prototxt')

solver.net.forward()
solver.test_nets[0].forward()
solver.step(1)

niter = 200
test_interval = 10
train_loss = zeros(niter)
test_acc = zeros(int(np.ceil(niter * 1.0 / test_interval)))
print(len(test_acc))
output = zeros((niter, 8, 4))

# The main solver loop
for it in range(niter):
    solver.step(1)  # SGD by Caffe
    train_loss[it] = solver.net.blobs['loss'].data
    solver.test_nets[0].forward(start='data')
    output[it] = solver.test_nets[0].blobs['ip3'].data[:8]

    if it % test_interval == 0:
        print('Iteration %d testing...' % it)
        correct = 0
        data = solver.test_nets[0].blobs['ip3'].data
        label = solver.test_nets[0].blobs['label'].data
        for test_it in range(100):
            solver.test_nets[0].forward()
            # Positive values map to label 1, while negative values map to label 0
            for i in range(len(data)):
                for j in range(len(data[i])):
                    if data[i][j] > 0 and label[i][j] == 1:
                        correct += 1
                    elif data[i][j] <= 0 and label[i][j] == 0:
                        correct += 1
        test_acc[int(it / test_interval)] = correct * 1.0 / (len(data) * len(data[0]) * 100)

# Training and testing done, output the convergence graph
_, ax1 = subplots()
ax2 = ax1.twinx()
ax1.plot(arange(niter), train_loss)
ax2.plot(test_interval * arange(len(test_acc)), test_acc, 'r')
ax1.set_xlabel('iteration')
ax1.set_ylabel('train loss')
ax2.set_ylabel('test accuracy')
_.savefig('converge.png')

# Check the result of last batch
print(solver.test_nets[0].blobs['ip3'].data)
print(solver.test_nets[0].blobs['label'].data)
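
The accuracy computed in the loop above thresholds each raw ip3 output at 0, which corresponds to a sigmoid probability of 0.5. The same idea can be sketched as a small standalone function. `multilabel_accuracy` is a hypothetical helper, and the sample scores are rounded from the first four rows of the last-batch output shown further down.

```python
def multilabel_accuracy(scores, labels):
    """Fraction of individual label entries predicted correctly."""
    correct = 0
    total = 0
    for score_row, label_row in zip(scores, labels):
        for score, label in zip(score_row, label_row):
            predicted = 1 if score > 0 else 0  # positive output maps to label 1
            correct += int(predicted == int(label))
            total += 1
    return correct / float(total)


scores = [[35.9, -37.5, -6.3, -6.3],
          [42.7, -43.0, 13.2, -3.4],
          [-1.4, 1.4, 2.8, -0.3],
          [2.9, -2.9, 4.3, 0.3]]
labels = [[1, 0, 0, 0],
          [1, 0, 1, 0],
          [0, 1, 1, 0],
          [1, 0, 1, 1]]
print(multilabel_accuracy(scores, labels))
```

Note that this counts each of the 4 label entries separately, so a sample with 3 of 4 entries right still contributes 0.75 rather than 0.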

An h5list file simply lists the paths of the h5 files, one per line:

train.h5list

/home/foo/bar/project/train.h5

test.h5list

/home/foo/bar/project/test.h5
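
If the list files should be generated rather than written by hand, a few lines of Python are enough. This is only a sketch: `write_h5list` is a hypothetical helper, and the temporary directory stands in for the real project directory.

```python
import os
import tempfile


def write_h5list(list_path, h5_paths):
    """Write one HDF5 file path per line."""
    with open(list_path, 'w') as f:
        for path in h5_paths:
            f.write(path + '\n')


project_home = tempfile.mkdtemp()  # stand-in for PROJECT_HOME
train_list = os.path.join(project_home, 'train.h5list')
write_h5list(train_list, [os.path.join(project_home, 'train.h5')])

with open(train_list) as f:
    print(f.read())
```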

and the solver:

auto_solver.prototxt

train_net: "auto_train.prototxt"
test_net: "auto_test.prototxt"
test_iter: 10
test_interval: 20
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100
max_iter: 10000
snapshot: 5000
snapshot_prefix: "sed"
solver_mode: GPU
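
For reference, the "inv" policy decays the learning rate as base_lr * (1 + gamma * iter) ^ (-power). The sketch below evaluates that formula with the values from this solver file; `inv_learning_rate` is a hypothetical helper, not part of caffe.

```python
def inv_learning_rate(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under the "inv" policy at a given iteration."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


for it in (0, 1000, 5000, 10000):
    print(it, inv_learning_rate(it))
```

With these settings the rate falls slowly, from 0.01 at the start to roughly 0.006 at max_iter.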

Convergence graph (train loss and test accuracy vs. iteration; figure not included):

Last batch result:

[[ 35.91593933 -37.46276474 -6.2579031 -6.30313492]
[ 42.69248581 -43.00864792 13.19664764 -3.35134125]
[ -1.36403108 1.38531208 2.77786589 -0.34310576]
[ 2.91686511 -2.88944006 4.34043217 0.32656598]
...
[ 35.91593933 -37.46276474 -6.2579031 -6.30313492]
[ 42.69248581 -43.00864792 13.19664764 -3.35134125]
[ -1.36403108 1.38531208 2.77786589 -0.34310576]
[ 2.91686511 -2.88944006 4.34043217 0.32656598]]

[[ 1. 0. 0. 0.]
[ 1. 0. 1. 0.]
[ 0. 1. 1. 0.]
[ 1. 0. 1. 1.]
...
[ 1. 0. 0. 0.]
[ 1. 0. 1. 0.]
[ 0. 1. 1. 0.]
[ 1. 0. 1. 1.]]

I think this code still has many things to improve. Any suggestion is appreciated.
