Using bvlc_googlenet as pretrained model in DIGITS - errors


Problem description



DIGITS 4.0, 0.14.0-rc.3 / Ubuntu (AWS)

Training a 5-class GoogLeNet model with about 800 training samples in each class. I was trying to use the ImageNet-trained bvlc_googlenet as the pre-trained model. These are the steps I took:

  1. Downloaded the ImageNet-pretrained weights from http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel and placed the file in /home/ubuntu/models.

  2. a. Pasted the "train_val.prototxt" from https://github.com/BVLC/caffe/blob/master/models/bvlc_reference_caffenet/train_val.prototxt into the custom network tab, and

     b. commented out the "source" and "backend" lines with '#' (since DIGITS was complaining about them) - see the sketch after this list.

  3. In the pre-trained models text box, pasted the path to the '.caffemodel', in my case: "/home/ubuntu/models/bvlc_googlenet.caffemodel".
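
For reference, a rough sketch (not the verbatim BVLC file) of what the edited TRAIN-phase Data layer looks like with those lines commented out; the TEST-phase Data layer in the same file has its own "source" and "backend" lines that get the same treatment:

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  data_param {
    # source: "examples/imagenet/ilsvrc12_train_lmdb"  (commented out - DIGITS supplies its own data source)
    batch_size: 256
    # backend: LMDB  (commented out - DIGITS supplies its own data source)
  }
}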

I get this error:

ERROR: Cannot copy param 0 weights from layer 'loss1/classifier'; shape mismatch. Source param shape is 1 1 1000 1024 (1024000); target param shape is 6 1024 (6144). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.

I have pasted various train_val.prototxt files from GitHub issues etc., and unfortunately no luck.

I am not sure why this has become so complicated; in older versions of DIGITS we could just enter the path to the folder and it worked great for transfer learning.

Could someone help?

Solution

Rename the layer from "loss1/classifier" to "loss1/classifier_retrain".
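
In prototxt terms, the renamed classifier head would look roughly like the sketch below (based on the BVLC GoogLeNet description, so treat the bottom name, lr_mult and filler values as placeholders). Caffe matches weights by layer name, so only the "name" field has to change; keeping the "top" blob name unchanged means the loss and accuracy layers that consume it do not need to be edited. Set num_output to your dataset's class count - the error message reports 6 target outputs:

layer {
  name: "loss1/classifier_retrain"  # renamed, so no pretrained weights are copied into it
  type: "InnerProduct"
  bottom: "loss1/fc"
  top: "loss1/classifier"           # unchanged, so downstream layers still connect
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 6                   # number of classes reported in the error message
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}

If your description also contains the other two GoogLeNet heads, "loss2/classifier" and "loss3/classifier", they carry 1000-class weights as well and will hit the same mismatch, so give them the same rename.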

When fine-tuning a model, here's what Caffe does:

# pseudo-code for Caffe's weight copying when fine-tuning
for layer in new_model.layers:
    if layer.name in old_model.layers:
        layer.weights = old_model.layers[layer.name].weights

You're getting an error because the weights for "loss1/classifier" were for a 1000-class classification problem (1000x1024), and you're trying to copy them into a layer for a 6-class classification problem (6x1024). When you rename the layer, Caffe doesn't try to copy the weights for that layer and you get randomly initialized weights - which is what you want.

Also, I suggest you use the network description below, which is already set up as an all-in-one description for GoogLeNet in DIGITS. It will save you some trouble.

https://github.com/NVIDIA/DIGITS/blob/digits-4.0/digits/standard-networks/caffe/googlenet.prototxt

