TensorFlow: How can I assign numpy pre-trained weights to subsections of a graph?
Question
This is a simple thing which I just couldn't figure out how to do.
I converted a pre-trained VGG caffe model to tensorflow using the github code from https://github.com/ethereon/caffe-tensorflow and saved it to vgg16.npy...
I then load the network into my default session, sess, as "net" using:
import tensorflow as tf

images = tf.placeholder(tf.float32, [1, 224, 224, 3])
net = VGGNet_xavier({'data': images, 'label' : 1})
with tf.Session() as sess:
    net.load("vgg16.npy", sess)
After net.load, I get a graph with a list of tensors. I can access individual tensors per layer using net.layers['conv1_1']... to get weights and biases for the first VGG convolutional layer, etc.
Now suppose that I make another graph that has as its first layer "h_conv1_b":
W_conv1_b = weight_variable([3,3,3,64])
b_conv1_b = bias_variable([64])
h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
My question is: how do I assign the pre-trained weights from net.layers['conv1_1'] to h_conv1_b? (Both are now tensors.)
Recommended answer
I suggest you take a detailed look at network.py from https://github.com/ethereon/caffe-tensorflow, especially the load() function. It will help you understand what happens when you call net.load(weight_path, session).
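As a rough illustration of the kind of data such a loader walks over, the sketch below builds a tiny stand-in for vgg16.npy and iterates through it. The layout (a pickled dict mapping each layer name to a dict with 'weights' and 'biases' arrays) is an assumption about the converter's output, and the file name and random values are made up for the demo. Check network.py and your own file for the actual format.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for vgg16.npy: layer name -> {'weights', 'biases'}.
# The real file's layout is an assumption here; inspect yours before relying on it.
fake_params = {
    "conv1_1": {
        "weights": np.random.rand(3, 3, 3, 64).astype(np.float32),
        "biases": np.zeros(64, dtype=np.float32),
    }
}

path = os.path.join(tempfile.mkdtemp(), "fake_vgg16.npy")
np.save(path, fake_params)  # a dict is stored as a pickled 0-d object array

# allow_pickle=True is needed to read a pickled dict; .item() unwraps it.
data_dict = np.load(path, allow_pickle=True).item()

shapes = {}
for layer_name, params in data_dict.items():
    for param_name, array in params.items():
        # A loader would look up the variable for (layer_name, param_name)
        # and run var.assign(array) in the session at this point.
        shapes[(layer_name, param_name)] = array.shape

print(shapes)
```

Printing the shapes first is a cheap way to confirm that a new variable such as W_conv1_b, created with shape [3,3,3,64], matches the stored array before any assignment is attempted.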
FYI, a TensorFlow variable can be assigned the value of a numpy array with var.assign(np_array), which must be run in a session. Here is the solution to your question:
with tf.Session() as sess:
    # Copy the pre-trained conv1_1 weights into the new variable.
    W_conv1_b = weight_variable([3,3,3,64])
    sess.run(W_conv1_b.assign(net.layers['conv1_1'].weights))
    # Do the same for the biases.
    b_conv1_b = bias_variable([64])
    sess.run(b_conv1_b.assign(net.layers['conv1_1'].biases))
    h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)
I would like to remind you of the following points:
- var.assign(data), where 'data' is a numpy array and 'var' is a TensorFlow variable, should be executed in the same session in which you go on to run the network, whether for inference or training.
- By default, 'var' must be created with the same shape as 'data'. Therefore, if you can obtain 'data' before creating 'var', I suggest creating 'var' with a matching shape, e.g. var = tf.Variable(tf.zeros(data.shape)). Otherwise, you need to create 'var' with validate_shape=False, e.g. var = tf.Variable(initial_value, validate_shape=False), which leaves the variable's shape flexible. Detailed explanations can be found in TensorFlow's API documentation.
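The two points above can be sketched as follows. This is a minimal example, not the repo's own code: it uses the tf.compat.v1 API so that it also runs on TensorFlow 2.x installs, and the random array stands in for the pre-trained conv1_1 weights. The variable name and the values are assumptions for the demo.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()  # use graph/session semantics, as in the thread
tf1.reset_default_graph()      # start from an empty graph

# Stand-in for pre-trained conv1_1 weights (random values, demo only).
data = np.random.rand(3, 3, 3, 64).astype(np.float32)

# The data is available up front, so create the variable with its shape.
var = tf1.get_variable("W_conv1_b", shape=data.shape, dtype=tf.float32)
assign_op = var.assign(data)

with tf1.Session() as sess:
    sess.run(tf1.global_variables_initializer())
    sess.run(assign_op)     # overwrite the default initialisation
    loaded = sess.run(var)  # read back within the same session

print(np.allclose(loaded, data))
```

The assignment and the read-back happen inside one session on purpose: a variable's value lives in the session, so a value assigned in one session is not visible from a fresh one unless it is saved and restored.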
I extended the same caffe-tensorflow repo to support Theano, so that I can load models converted from Caffe in Theano. I am therefore reasonably familiar with this repo's code. Please feel free to contact me if you have any further questions.