Why are there two additional variables in the checkpoint for each layer?

Problem Description

I created a convolutional neural network with three convolutional layers and two fully connected layers, and I used tf.train.Saver() to save the variables. When I use inspect_checkpoint.py to check the variables saved in the checkpoint file, why are there two additional variables saved for each layer, like Adam_1 and Adam? Also, what are beta1_power and beta2_power?

conv_layer1_b (DT_FLOAT) [32]
conv_layer1_w (DT_FLOAT) [1,16,1,32]
conv_layer1_b/Adam (DT_FLOAT) [32]
conv_layer1_w/Adam (DT_FLOAT) [1,16,1,32]
conv_layer1_w/Adam_1 (DT_FLOAT) [1,16,1,32]
conv_layer1_b/Adam_1 (DT_FLOAT) [32]
conv_layer3_w/Adam (DT_FLOAT) [1,16,64,64]
conv_layer3_w (DT_FLOAT) [1,16,64,64]
conv_layer3_b/Adam_1 (DT_FLOAT) [64]
conv_layer3_b (DT_FLOAT) [64]
conv_layer3_b/Adam (DT_FLOAT) [64]
conv_layer3_w/Adam_1 (DT_FLOAT) [1,16,64,64]
conv_layer2_w/Adam_1 (DT_FLOAT) [1,16,32,64]
conv_layer2_w/Adam (DT_FLOAT) [1,16,32,64]
conv_layer2_w (DT_FLOAT) [1,16,32,64]
conv_layer2_b/Adam_1 (DT_FLOAT) [64]
conv_layer2_b (DT_FLOAT) [64]
conv_layer2_b/Adam (DT_FLOAT) [64]
beta1_power (DT_FLOAT) []
beta2_power (DT_FLOAT) []
NN1_w (DT_FLOAT) [2432,512]
NN1_b (DT_FLOAT) [512]
NN1_w/Adam_1 (DT_FLOAT) [2432,512]
NN1_b/Adam_1 (DT_FLOAT) [512]
NN1_w/Adam (DT_FLOAT) [2432,512]
NN1_b/Adam (DT_FLOAT) [512]
NN2_w (DT_FLOAT) [512,2]
NN2_b (DT_FLOAT) [2]
NN2_w/Adam_1 (DT_FLOAT) [512,2]
NN2_b/Adam_1 (DT_FLOAT) [2]
NN2_w/Adam (DT_FLOAT) [512,2]
NN2_b/Adam (DT_FLOAT) [2]
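For reference, a listing like the one above can be produced with the checkpoint reader API. This is a minimal sketch assuming TensorFlow 1.x; the checkpoint prefix "model.ckpt" is a placeholder for your own path:

    import tensorflow as tf  # TF 1.x

    # Hypothetical checkpoint prefix; replace with your own.
    reader = tf.train.NewCheckpointReader("model.ckpt")

    # Maps every saved variable name to its shape, including the
    # optimizer's extra entries such as conv_layer1_w/Adam.
    for name, shape in sorted(reader.get_variable_to_shape_map().items()):
        print(name, shape)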

Recommended Answer

You're using the Adam optimizer (https://arxiv.org/abs/1412.6980) for optimization. Adam keeps two state variables that store statistics about the gradients, each the same size as the parameters (Algorithm 1); these account for the two additional variables per parameter variable. The optimizer itself also has a few hyperparameters, among them β1 and β2, which I guess are in your case stored as variables.
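To make this concrete, here is a minimal sketch (assuming TF 1.x and tf.train.AdamOptimizer, with a toy variable w that is illustrative only) showing that the two per-parameter state variables are Adam's first- and second-moment slots, saved under the /Adam and /Adam_1 suffixes:

    import tensorflow as tf  # TF 1.x

    # Toy variable and loss; the names here are illustrative only.
    w = tf.Variable(tf.zeros([4]), name="w")
    loss = tf.reduce_sum(tf.square(w))

    opt = tf.train.AdamOptimizer(learning_rate=1e-3)
    train_op = opt.minimize(loss)

    # Adam creates one "m" (first moment) and one "v" (second moment)
    # slot per trainable variable, each with the variable's shape.
    print(opt.get_slot_names())        # ['m', 'v']
    print(opt.get_slot(w, "m").name)   # e.g. w/Adam:0   -> saved as w/Adam
    print(opt.get_slot(w, "v").name)   # e.g. w/Adam_1:0 -> saved as w/Adam_1

As for the two scalar entries: in TensorFlow's implementation, beta1_power and beta2_power hold the running powers β1^t and β2^t that Algorithm 1 uses to bias-correct the two moment estimates, which is why they are saved alongside the slots.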
