Retraining a CNN without a high-level API

Published: 2020/5/17 19:34:04. Tags: python tensorflow neural-network pre-trained-model


Question

Summary: I am trying to retrain a simple CNN for MNIST without using a high-level API. I have already succeeded in doing so by retraining the entire network, but my current goal is to retrain only the last one or two fully connected layers.

Work so far: Say I have a CNN with the following structure:

Convolutional Layer
RELU
Pooling Layer
Convolutional Layer
RELU
Pooling Layer
Fully Connected Layer
RELU
Dropout Layer
Fully Connected Layer to 10 output classes
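
The layer sizes used later in the post (a flattened input of 7 * 7 * 64 to the first fully connected layer) follow from this structure. A minimal sketch of the arithmetic, assuming 28x28 MNIST inputs, stride-1 convolutions with 'SAME' padding (as in the conv example in the post), and 2x2 max pooling with stride 2; the pooling parameters and the helper names are assumptions, not something stated in the question:

```python
import math

def same_padding_out(size, stride):
    # With 'SAME' padding the spatial output size is ceil(input / stride),
    # independent of the kernel or window size.
    return math.ceil(size / stride)

size = 28                                  # MNIST images are 28x28
size = same_padding_out(size, stride=1)    # conv 1 -> 28
size = same_padding_out(size, stride=2)    # pool 1 -> 14 (assumed 2x2, stride 2)
size = same_padding_out(size, stride=1)    # conv 2 -> 14
size = same_padding_out(size, stride=2)    # pool 2 -> 7  (assumed 2x2, stride 2)

channels = 64                              # output channels of the second conv layer
fc_input_size = size * size * channels
print(size, fc_input_size)
```

Under these assumptions the result matches the `input_size = 7 * 7 * 64` used for the first fully connected layer.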

My goal is to retrain either the last fully connected layer or the last two fully connected layers.

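Example of a convolutional layer:

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# Assumes each layer is built inside its own tf.variable_scope, so that the
# names "W" and "b" do not collide with those of other layers.
W_conv1 = tf.get_variable("W", [5, 5, 1, 32],
      initializer=tf.truncated_normal_initializer(stddev=np.sqrt(2.0 / 784)))
b_conv1 = tf.get_variable("b", initializer=tf.constant(0.1, shape=[32]))
z = tf.nn.conv2d(x_image, W_conv1, strides=[1, 1, 1, 1], padding='SAME')
h_conv1 = tf.nn.relu(z + b_conv1)  # add the bias exactly once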

Example of a fully connected layer:

input_size = 7 * 7 * 64
W_fc1 = tf.get_variable("W", [input_size, 1024], initializer=tf.truncated_normal_initializer(stddev=np.sqrt(2.0/input_size)))
b_fc1 = tf.get_variable("b", initializer=tf.constant(0.1, shape=[1024]))
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

My assumption: When performing backpropagation on the new dataset, I simply need to make sure that the weights W and biases b (from W*x+b) stay fixed in the layers that are not fully connected.

A first thought on how to do this: Save W and b, perform a backpropagation step, and then overwrite the new W and b with the saved ones in the layers I don't want changed.

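My thoughts on this first approach:

It is computationally intensive and wastes memory. The whole benefit of training only the last layer is not having to do the work for the others.
Might backpropagation itself work differently if it is not applied to all layers?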
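
On the conceptual side, the gradient of a layer's parameters depends only on the activations feeding that layer and on the error signal arriving from above, so the formulas for the last layer are the same whether or not earlier layers are trained. Freezing earlier layers just means their gradients are never computed and no update is applied to them. The following is a minimal sketch of that idea in plain Python on a made-up two-layer scalar network; the network, data, and names (`train_last_layer`, `params`) are illustrative assumptions, not the CNN from the question:

```python
def relu(v):
    return v if v > 0.0 else 0.0

def forward(params, x):
    h = relu(params["w1"] * x + params["b1"])   # frozen "feature" layer
    y = params["w2"] * h + params["b2"]         # trainable output layer
    return h, y

def mean_loss(params, data):
    return sum((forward(params, x)[1] - t) ** 2 for x, t in data) / len(data)

def train_last_layer(params, data, lr=0.1, epochs=50):
    for _ in range(epochs):
        for x, target in data:
            h, y = forward(params, x)
            dy = 2.0 * (y - target)     # dL/dy for L = (y - target)^2
            # Gradients of the output layer only. The backward pass stops
            # here: dL/dh, dL/dw1 and dL/db1 are never computed.
            params["w2"] -= lr * dy * h
            params["b2"] -= lr * dy
    return params

params = {"w1": 0.8, "b1": 0.1, "w2": 0.0, "b2": 0.0}
frozen_before = (params["w1"], params["b1"])
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]     # toy targets: 2x + 1

loss_before = mean_loss(params, data)
train_last_layer(params, data)
loss_after = mean_loss(params, data)
frozen_after = (params["w1"], params["b1"])
print(loss_before, loss_after, frozen_before == frozen_after)
```

Compared with the save-and-restore idea, nothing is computed for the frozen layer and nothing has to be copied back, which is the saving the first bullet is after.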

My question:

How do I properly retrain specific layers of a neural network when not using high-level APIs? Both conceptual and coding answers are welcome.

P.S. I am fully aware of how to do this using high-level APIs (example: https://towardsdatascience.com/how-to-train-your-model-dramatically-faster-9ad063f0f718). I just don't want neural networks to be magic; I want to know what actually happens.

Recommended answer

The minimize function of optimizers has an optional argument, var_list, for choosing which variables to train, e.g.:

optimizer_step = tf.train.MomentumOptimizer(learning_rate, momentum, name='MomentumOptimizer').minimize(loss, var_list=training_variables)

You can get the variables of the layers you want to train by calling tf.trainable_variables() before and after creating those layers and taking the difference:

vars1 = tf.trainable_variables()

# FC Layer
input_size = 7 * 7 * 64
W_fc1 = tf.get_variable("W", [input_size, 1024], initializer=tf.truncated_normal_initializer(stddev=np.sqrt(2.0/input_size)))
b_fc1 = tf.get_variable("b", initializer=tf.constant(0.1, shape=[1024]))
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

vars2 = tf.trainable_variables()

training_variables = list(set(vars2) - set(vars1))

Actually, using tf.trainable_variables() is probably overkill in this case, since you already have W_fc1 and b_fc1 directly and can pass var_list=[W_fc1, b_fc1]. It would be useful, for example, if you had used tf.layers.dense to create a dense layer, where you do not hold the variables explicitly.
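
When the variables are not held explicitly, another option is to build each layer inside a variable scope and collect its trainable variables by scope name. The sketch below assumes the TF1-style graph API reached through `tf.compat.v1`; the tiny dense network, the scope names "frozen" and "fc_out", and the random data are hypothetical stand-ins for the pretrained CNN, not the asker's model. It runs one training step and checks which variables moved:

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF1-style graph API, also shipped with TensorFlow 2.x

graph = tf.Graph()
with graph.as_default():
    tf1.set_random_seed(0)
    x = tf1.placeholder(tf.float32, [None, 8])
    y = tf1.placeholder(tf.float32, [None, 2])

    # Hypothetical stand-in for the pretrained part (the conv layers).
    with tf1.variable_scope("frozen"):
        W0 = tf1.get_variable("W", [8, 4],
                              initializer=tf1.truncated_normal_initializer(stddev=0.5))
        b0 = tf1.get_variable("b", [4], initializer=tf1.constant_initializer(0.1))
        h = tf.nn.relu(tf.matmul(x, W0) + b0)

    # The layer we want to retrain.
    with tf1.variable_scope("fc_out"):
        W1 = tf1.get_variable("W", [4, 2],
                              initializer=tf1.truncated_normal_initializer(stddev=0.5))
        b1 = tf1.get_variable("b", [2], initializer=tf1.constant_initializer(0.1))
        out = tf.matmul(h, W1) + b1

    loss = tf.reduce_mean(tf.square(out - y))

    # Collect trainable variables by scope instead of holding them by hand.
    frozen_vars = tf1.get_collection(tf1.GraphKeys.TRAINABLE_VARIABLES, scope="frozen")
    train_vars = tf1.get_collection(tf1.GraphKeys.TRAINABLE_VARIABLES, scope="fc_out")

    # Only the variables in var_list receive gradient updates.
    train_step = tf1.train.MomentumOptimizer(0.1, 0.9).minimize(
        loss, var_list=train_vars)
    init = tf1.global_variables_initializer()

rng = np.random.RandomState(0)
feed = {x: rng.rand(16, 8).astype(np.float32),
        y: rng.rand(16, 2).astype(np.float32)}

with tf1.Session(graph=graph) as sess:
    sess.run(init)
    frozen_before = sess.run(frozen_vars)
    fc_before = sess.run(train_vars)
    sess.run(train_step, feed_dict=feed)
    frozen_after = sess.run(frozen_vars)
    fc_after = sess.run(train_vars)

print([v.name for v in train_vars])
print(all(np.array_equal(a, b) for a, b in zip(frozen_before, frozen_after)))
```

In a real retraining setup the frozen variables would first be restored from a checkpoint; that step is omitted here.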
