How to use a pre-trained model as a non-trainable sub-network in TensorFlow?


Question

I'd like to train a network that contains a sub-network that must stay fixed during training. The basic idea is to prepend and append some layers to the pre-trained network (InceptionV3):

new_layers -> pre-trained and fixed sub-net (inceptionv3) -> new_layers

and run the training process for my task without changing the pre-trained part. I also need to branch directly on some layer of the pre-trained network. For example, with InceptionV3 I would like to use it from the conv 299x299 layer to the last pool layer, or from the conv 79x79 layer to the last pool layer.

Recommended answer

Whether a "layer" is trained is determined by whether the variables used in that layer are updated with gradients. If you are using the Optimizer interface to optimize your network, you can simply leave the variables of the layers you want to keep fixed out of the variable list passed to the minimize function, i.e.,

opt.minimize(loss, var_list=<subset of variables you want to train>)
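As a minimal sketch of this idea, the snippet below uses the graph-mode API under `tf.compat.v1`. The two small weight matrices are hypothetical stand-ins (`w_frozen` for the pre-trained part, `w_new` for an appended layer), not InceptionV3, and the learning rate and data are arbitrary.

```python
import tensorflow as tf

tf1 = tf.compat.v1

graph = tf.Graph()
with graph.as_default():
    x = tf.constant([[1.0, 2.0]])
    y = tf.constant([[1.0]])

    # Hypothetical stand-in for pre-trained weights that must stay fixed.
    w_frozen = tf1.Variable([[0.5, -0.5], [0.25, 0.75]], name="w_frozen")
    # Hypothetical new layer appended after the frozen part.
    w_new = tf1.Variable([[1.0], [1.0]], name="w_new")

    hidden = tf.matmul(x, w_frozen)
    pred = tf.matmul(hidden, w_new)
    loss = tf.reduce_mean(tf.square(pred - y))

    opt = tf1.train.GradientDescentOptimizer(learning_rate=0.1)
    # Only w_new is listed, so no update op is created for w_frozen.
    train_op = opt.minimize(loss, var_list=[w_new])
    init = tf1.global_variables_initializer()

with tf1.Session(graph=graph) as sess:
    sess.run(init)
    frozen_before, new_before = sess.run([w_frozen, w_new])
    for _ in range(5):
        sess.run(train_op)
    frozen_after, new_after = sess.run([w_frozen, w_new])
```

Comparing `frozen_before` with `frozen_after` (and `new_before` with `new_after`) is one way to check that only the intended variables moved.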

If you are using the tf.gradients function directly, remove the variables that you want to keep fixed from its second argument (xs).
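The same effect with `tf.gradients` might look like the sketch below, again in graph mode via `tf.compat.v1`. The variables are hypothetical toy stand-ins, and the hand-written gradient-descent step is only one possible way to apply the gradient.

```python
import tensorflow as tf

tf1 = tf.compat.v1

graph = tf.Graph()
with graph.as_default():
    x = tf.constant([[1.0, 2.0]])
    y = tf.constant([[1.0]])

    # Hypothetical frozen (pre-trained) weights and a new trainable layer.
    w_frozen = tf1.Variable([[0.5, -0.5], [0.25, 0.75]], name="w_frozen")
    w_new = tf1.Variable([[1.0], [1.0]], name="w_new")

    pred = tf.matmul(tf.matmul(x, w_frozen), w_new)
    loss = tf.reduce_mean(tf.square(pred - y))

    # The second argument (xs) lists only the variable we want to train,
    # so no gradient is requested for w_frozen.
    (grad_new,) = tf.gradients(loss, [w_new])

    # Plain gradient-descent step applied by hand (step size is arbitrary).
    update_op = tf1.assign_sub(w_new, 0.1 * grad_new)
    init = tf1.global_variables_initializer()

with tf1.Session(graph=graph) as sess:
    sess.run(init)
    frozen_before, new_before = sess.run([w_frozen, w_new])
    sess.run(update_op)
    frozen_after, new_after = sess.run([w_frozen, w_new])
```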

Now, how you "branch directly" to a layer of a pre-trained network depends on how that network is implemented. I would simply locate the convolution call (for example tf.nn.conv2d) for the 299x299 layer you are talking about and pass the output of your new layer as its input; on the output side, locate the 79x79 layer and use its output as the input to your new layer.
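If the pre-trained network happens to be available as a Keras model, one possible way to wire this up is sketched below. This swaps in `tf.keras.applications.InceptionV3` rather than whatever implementation the question uses. The cut point `mixed7`, the 1x1 convolution in front, and the 10-unit output layer are illustrative assumptions. The sketch enters the frozen network at its input and leaves it at an intermediate layer; it does not show entering mid-network.

```python
import tensorflow as tf


def build_model(weights=None, cut_layer="mixed7"):
    # weights="imagenet" would download the pre-trained weights;
    # None keeps this sketch offline (random initialisation).
    base = tf.keras.applications.InceptionV3(
        include_top=False, weights=weights, input_shape=(299, 299, 3)
    )
    base.trainable = False  # freezes every layer of the base model

    # Frozen sub-network from the input up to an intermediate layer.
    trunk = tf.keras.Model(
        base.input, base.get_layer(cut_layer).output, name="frozen_trunk"
    )
    trunk.trainable = False

    inputs = tf.keras.Input(shape=(299, 299, 3))
    # Hypothetical new trainable layer placed in front of the frozen part.
    h = tf.keras.layers.Conv2D(3, 1, padding="same", name="new_pre")(inputs)
    # training=False keeps batch normalisation in inference mode.
    h = trunk(h, training=False)
    # Hypothetical new trainable layers placed after the frozen part.
    h = tf.keras.layers.GlobalAveragePooling2D()(h)
    outputs = tf.keras.layers.Dense(10, name="new_post")(h)
    return tf.keras.Model(inputs, outputs)


model = build_model()
# Only the kernels and biases of new_pre and new_post remain trainable.
num_trainable = len(model.trainable_variables)
```

With a setup like this, `model.trainable_variables` can be passed to the optimizer, so the frozen trunk is left untouched during training.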
