Food101 SqueezeNet Caffe2 number of iterations

Question

I am trying to classify the ETH Food-101 dataset using squeezenet in Caffe2. My model is imported from the Model Zoo and I made two types of modifications to the model:

1) Changing the dimensions of the last layer to have 101 outputs

2) The images from the database are in NHWC form and I just flipped the dimensions of the weights to match. (I plan on changing this)
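
For reference, the more conventional fix for (2) is to transpose the input tensor rather than the weight blobs; a minimal sketch, assuming the DB reader still yields NHWC data under a hypothetical blob name 'data_nhwc':

# Sketch only: convert the NHWC input to the NCHW layout SqueezeNet expects,
# instead of flipping the weight dimensions. 'data_nhwc' is a hypothetical
# blob name for the NHWC-ordered batch coming out of the DB reader.
data = train_model.net.NHWC2NCHW('data_nhwc', 'data')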

The Food101 dataset has 75,000 images for training and I am currently using a batch size of 128 and a starting learning rate of -0.01 with a gamma of 0.999 and stepsize of 1. What I noticed is that for the first 2000 iterations of the network the accuracy hovered around 1/128 and this took an hour or so to complete.
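
The AddTrainingOperators helper called further down is not shown in the post; assuming it follows the standard Caffe2 MNIST tutorial pattern, the schedule described above (plain SGD, base_lr = -0.01, step policy, stepsize 1, gamma 0.999) would look roughly like this (the structure and names are an assumption, not the poster's actual code):

from caffe2.python import brew

def AddTrainingOperators(model, softmax, label):
    # cross-entropy loss on the network's softmax output
    xent = model.LabelCrossEntropy([softmax, label], 'xent')
    loss = model.AveragedLoss(xent, 'loss')
    brew.accuracy(model, [softmax, label], 'accuracy')
    # build gradient operators for every blob registered in model.params
    model.AddGradientOperators([loss])
    # step schedule from the question: lr_i = -0.01 * 0.999 ** i
    ITER = brew.iter(model, 'iter')
    LR = model.LearningRate(ITER, 'LR', base_lr=-0.01,
                            policy='step', stepsize=1, gamma=0.999)
    ONE = model.param_init_net.ConstantFill([], 'ONE', shape=[1], value=1.0)
    # plain SGD update: param := param + LR * grad (LR is negative)
    for param in model.params:
        grad = model.param_to_grad[param]
        model.WeightedSum([param, ONE, grad, LR], param)

Note that with stepsize 1 the rate decays on every iteration: 0.999^2000 is roughly 0.135, so by iteration 2000 the effective rate is already about -0.0014.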

I added all the weights to model.params so they can get updated during gradient descent (except for the data blob), and reinitialized all weights with Xavier and all biases to a constant. I would expect the accuracy to grow fairly quickly over the first hundred to thousand iterations and then tail off as the number of iterations grows. In my case, the accuracy stays roughly constant around 0.

When I look at the gradient file I find that the average is on the order of 10^-6 with a standard deviation of 10^-7. This explains the slow learning, but I haven't been able to get the gradients to start out much larger.

These are the gradient statistics for the first convolution after a few iterations

    Min        Max          Avg       Sdev
-1.69821e-05 2.10922e-05 1.52149e-06 5.7707e-06
-1.60263e-05 2.01478e-05 1.49323e-06 5.41754e-06
-1.62501e-05 1.97764e-05 1.49046e-06 5.2904e-06
-1.64293e-05 1.90508e-05 1.45681e-06 5.22742e-06
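
These numbers were presumably gathered by fetching the gradient blob from the workspace, along these lines ('conv1_w_grad' is an assumed name, following Caffe2's convention of appending '_grad' to the parameter blob):

g = ws.FetchBlob('conv1_w_grad')  # assumed gradient blob name for conv1_w
print g.min(), g.max(), g.mean(), g.std()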

Here are the core parts of my code:

import os
import numpy as np
# Caffe2 modules matching the ws./core./model_helper usage below
from caffe2.python import workspace as ws, core, model_helper

# AddInput, AddTrainingOperators, AddBookkeepingOperators and
# update_squeeze_net are defined elsewhere in my code (not shown here).
#init_path is path to init_net protobuf
#pred_path is path to pred_net protobuf
def main(init_path, pred_path):
    ws.ResetWorkspace()
    data_folder = '/home/myhome/food101/'
    #some debug code here
    arg_scope = {"order":"NCHW"}
    train_model = model_helper.ModelHelper(name="food101_train", arg_scope=arg_scope)
    if not debug:
            data, label = AddInput(
                    train_model, batch_size=128,
                    db=os.path.join(data_folder, 'food101-train-nchw-leveldb'),
                    db_type='leveldb')
    init_net_def, pred_net_def = update_squeeze_net(init_path, pred_path)
    #print str(init_net_def)
    train_model.param_init_net.AppendNet(core.Net(init_net_def))
    train_model.net.AppendNet(core.Net(pred_net_def))
    ws.RunNetOnce(train_model.param_init_net)
    add_params(train_model, init_net_def)
    AddTrainingOperators(train_model, 'softmaxout', 'label')
    AddBookkeepingOperators(train_model)

    ws.RunNetOnce(train_model.param_init_net)
    if debug:
            ws.FeedBlob('data', data)
            ws.FeedBlob('label', label)
    ws.CreateNet(train_model.net)

    total_iters = 10000
    accuracy = np.zeros(total_iters)
    loss = np.zeros(total_iters)
    # Manually run the network for total_iters (10,000) iterations.
    for i in range(total_iters):
            #try:
            conv1_w = ws.FetchBlob('conv1_w')
            print conv1_w[0][0]
            ws.RunNet("food101_train")
            #except RuntimeError:
            #       print ws.FetchBlob('conv1').shape
            #       print ws.FetchBlob('pool1').shape
            #       print ws.FetchBlob('fire2/squeeze1x1_w').shape
            #       print ws.FetchBlob('fire2/squeeze1x1_b').shape
            #softmax = ws.FetchBlob('softmaxout')
            #print softmax[i]
            #print softmax[i][0][0]
            #print softmax[i][0][:5]
            #print softmax[64*i]
            accuracy[i] = ws.FetchBlob('accuracy')
            loss[i] = ws.FetchBlob('loss')
            print accuracy[i], loss[i]

My add_params function initializes the weights as follows

# imports assumed for the initializer helpers used below
from caffe2.python.modeling import initializers
from caffe2.python.modeling.parameter_info import ParameterTags

#ops allows me to only initialize the weights of specific ops because I initially was going to do last-layer training
def add_params(model, init_net_def, ops=[]):
    def add_param(op):
            for output in op.output:
                    if "_w" in output:
                            weight_shape = []
                            for arg in op.arg:
                                    if arg.name == 'shape':
                                            weight_shape = arg.ints
                            weight_initializer = initializers.update_initializer(
                                                    None,
                                                    None,
                                                    ("XavierFill", {}))
                            model.create_param(
                                    param_name=output,
                                    shape=weight_shape,
                                    initializer=weight_initializer,
                                    tags=ParameterTags.WEIGHT)
                    elif "_b" in output:
                            weight_shape = []
                            for arg in op.arg:
                                    if arg.name == 'shape':
                                            weight_shape = arg.ints
                            weight_initializer = initializers.update_initializer(
                                                    None,
                                                    None,
                                                    ("ConstantFill", {}))
                            model.create_param(
                                    param_name=output,
                                    shape=weight_shape,
                                    initializer=weight_initializer,
                                    tags=ParameterTags.BIAS)
    # assumed completion (the posted snippet is cut off above): register
    # params for each op in the init net, honoring the optional ops filter
    for op in init_net_def.op:
            if not ops or any(out.startswith(tuple(ops)) for out in op.output):
                    add_param(op)

I find that my loss function fluctuates when I use the full training set, but if I use just one batch and iterate over it several times, the loss goes down, though very slowly.

Answer

While SqueezeNet has 50x fewer parameters than AlexNet, it is still a very large network. The original paper does not mention a training time, but the SqueezeNet-based SQ required 22 hours to train using two Titan X graphics cards - and that was with some of the weights pre-trained! I haven't gone over your code in detail, but what you describe is expected behavior - your network is able to learn on the single batch, just not as quickly as you expected.

I suggest reusing as many of the weights as possible instead of reinitializing them, just as the creators of SQ did. This is known as transfer learning, and it works because many of the lower-level features (lines, curves, basic shapes) in an image are the same regardless of the image's content, and reusing the weights for these layers saves the network from having to re-learn them from scratch.
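
A minimal sketch of what that could look like using the question's own add_params hook, assuming 'conv10' is the name of SqueezeNet's final 1x1 convolution (the layer that was widened to 101 outputs) and that param_init_net carries the pretrained Model Zoo fills:

# Load the pretrained Model Zoo values into the workspace.
ws.RunNetOnce(train_model.param_init_net)

# Register and re-initialize only the resized final layer; every other layer
# keeps its pretrained weights and is left out of model.params (i.e. frozen).
# 'conv10' is an assumed name for SqueezeNet's final 1x1 convolution.
add_params(train_model, init_net_def, ops=['conv10'])

To fine-tune every layer rather than only the last one, register all of the parameters but skip the Xavier/constant re-fill, so the pretrained values are kept yet still updated by the SGD step.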
