Using bvlc_googlenet as pretrained model in DIGITS - errors


Problem description

DIGITS 4.0 0.14.0-rc.3 / Ubuntu (AWS)

I am training a 5-class GoogLeNet model with about 800 training samples in each class, and I was trying to use bvlc_googlenet as the pre-trained model. These are the steps I took:

  1. Downloaded the model from http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel and placed it in /home/ubuntu/models

  2. a. Pasted the "train_val.prototxt" from https://github.com/BVLC/caffe/blob/master/models/bvlc_reference_caffenet/train_val.prototxt into the custom network tab, and

     b. Commented out the "source" and "backend" lines with '#' (since it was complaining about them)

  3. Pasted the path to the ".caffemodel" into the pretrained model text box. In my case: "/home/ubuntu/models/bvlc_googlenet.caffemodel"

I get this error:

ERROR: Cannot copy param 0 weights from layer 'loss1/classifier'; shape mismatch. Source param shape is 1 1 1000 1024 (1024000); target param shape is 6 1024 (6144). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.

I have pasted various train_val.prototxt files from GitHub issues etc. and had no luck unfortunately.

I am not sure why this is getting so complicated; in older versions of DIGITS, we could just enter the path to the folder and it worked great for transfer learning.

Can anyone help?

Accepted answer

Rename the layer from "loss1/classifier" to "loss1/classifier_retrain".
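If you are editing the prototxt by hand, the renamed layer would look roughly like this. This is a sketch only: the bottom blob name and num_output value are assumptions taken from the standard GoogLeNet definition and the error message, so check them against your own network description.

```prototxt
# Sketch only - verify field values against your own train_val.prototxt.
layer {
  name: "loss1/classifier_retrain"   # renamed: Caffe will not copy the saved 1000-class weights
  type: "InnerProduct"
  bottom: "loss1/fc"                 # assumed bottom blob; use whatever your prototxt has
  top: "loss1/classifier_retrain"
  inner_product_param {
    num_output: 6                    # matches the 6-class target shape from the error message
    weight_filler { type: "xavier" } # fresh random initialization for the renamed layer
  }
}
```

GoogLeNet has three classifier heads; if the same error appears for "loss2/classifier" or "loss3/classifier", rename those layers the same way.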

When fine-tuning a model, here's what Caffe does:

# pseudo-code for Caffe's weight copying when loading a saved net
for layer in new_model:
    if layer.name in old_model:
        # shapes must match, or Caffe raises the "Cannot copy param" error
        new_model.layer.weights = old_model.layer.weights

You're getting an error because the weights for "loss1/classifier" were for a 1000-class classification problem (1000x1024), and you're trying to copy them into a layer for a 6-class classification problem (6x1024). When you rename the layer, Caffe doesn't try to copy the weights for that layer, and you get randomly initialized weights - which is exactly what you want.
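To make the shape mismatch concrete, here is a minimal sketch where NumPy arrays stand in for Caffe's parameter blobs; the shapes are taken from the error message:

```python
import numpy as np

# Shapes from the error message: source (saved net) vs. target (new net)
old_weights = np.random.randn(1000, 1024)  # "loss1/classifier" trained on 1000 ImageNet classes
new_weights = np.zeros((6, 1024))          # the same layer name in the 6-class network

# Caffe refuses the copy because the parameter shapes differ:
print(old_weights.shape != new_weights.shape)  # True - hence "shape mismatch"

# Renaming the layer means no copy is attempted, and the (6, 1024) weights
# are instead filled by the layer's weight_filler (random initialization).
```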

Also, I suggest you use this network description, which is already set up as an all-in-one network description for GoogLeNet. It will save you some trouble:

https://github.com/NVIDIA/DIGITS/blob/digits-4.0/digits/standard-networks/caffe/googlenet.prototxt

