Tensorflow: How can I assign numpy pre-trained weights to subsections of graph?


Question

This is a simple thing which I just couldn't figure out how to do.

I converted a pre-trained VGG caffe model to tensorflow using the github code from https://github.com/ethereon/caffe-tensorflow and saved it to vgg16.npy...

I then load the network to my sess default session as "net" using:

images = tf.placeholder(tf.float32, [1, 224, 224, 3])   # single 224x224 RGB image
net = VGGNet_xavier({'data': images, 'label': 1})        # VGG network definition produced by the converter
with tf.Session() as sess:
  net.load("vgg16.npy", sess)                            # restore the converted caffe weights into the graph

After net.load, I get a graph with a list of tensors. I can access individual tensors per layer using net.layers['conv1_1']... to get weights and biases for the first VGG convolutional layer, etc.
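
As a side note, if you just want to inspect the converted arrays without building the graph, you can open the .npy file directly with numpy. A minimal sketch, assuming the converter saved a pickled dict of {layer_name: {'weights': ..., 'biases': ...}} (the exact key names depend on the caffe-tensorflow version):

import numpy as np

# The converter stores a pickled dict, hence allow_pickle and .item().
data_dict = np.load("vgg16.npy", allow_pickle=True, encoding="latin1").item()

# Key names below are an assumption about the converter's output layout.
w = data_dict["conv1_1"]["weights"]   # expected shape (3, 3, 3, 64)
b = data_dict["conv1_1"]["biases"]    # expected shape (64,)
print(w.shape, b.shape)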

Now suppose that I make another graph that has as its first layer "h_conv1_b":

  W_conv1_b = weight_variable([3,3,3,64])   # 3x3 kernels, 3 input channels, 64 filters (same shape as VGG conv1_1)
  b_conv1_b = bias_variable([64])
  h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)

My question is: how do you assign the pre-trained weights from net.layers['conv1_1'] to h_conv1_b? (Both are now tensors.)

Answer

I suggest you have a detailed look at network.py from https://github.com/ethereon/caffe-tensorflow, especially the function load(). It will help you understand what happens when you call net.load(weight_path, session).
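
For intuition, here is a rough sketch of the pattern such a load() typically follows: walk the saved dict and copy each numpy array into the matching variable inside that layer's variable scope. This is an illustrative approximation under the assumed dict-of-dicts .npy layout, not the repo's exact code:

import numpy as np
import tensorflow as tf

def load_weights(weight_path, session, ignore_missing=False):
    """Illustrative only: copy each saved numpy array into the matching TF variable."""
    data_dict = np.load(weight_path, allow_pickle=True, encoding="latin1").item()
    for layer_name, params in data_dict.items():
        with tf.variable_scope(layer_name, reuse=True):
            for param_name, value in params.items():
                try:
                    var = tf.get_variable(param_name)   # e.g. 'weights' or 'biases'
                    session.run(var.assign(value))      # push the numpy array into the graph
                except ValueError:
                    if not ignore_missing:
                        raise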

FYI, a variable in TensorFlow can be assigned a numpy array with var.assign(np_array), executed inside the session. Here is the solution to your question:

with tf.Session() as sess:
  W_conv1_b = weight_variable([3,3,3,64])
  sess.run(W_conv1_b.assign(net.layers['conv1_1'].weights))   # copy the pre-trained kernel into the new variable
  b_conv1_b = bias_variable([64])
  sess.run(b_conv1_b.assign(net.layers['conv1_1'].biases))    # copy the pre-trained bias
  h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)

I would like to remind you of the following points (a short sketch illustrating both follows the list):

  1. var.assign(data), where 'data' is a numpy array and 'var' is a TensorFlow variable, should be executed in the same session in which you want to keep executing your network, whether for inference or training.
  2. By default, 'var' should be created with the same shape as 'data'. Therefore, if you can obtain 'data' before creating 'var', I suggest you create 'var' with that shape, e.g. var = tf.Variable(shape=data.shape). Otherwise, you need to create 'var' with validate_shape=False, which means the variable's shape is left flexible rather than validated. Detailed explanations can be found in TensorFlow's API docs.
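
To make both points concrete, here is a small sketch (variable names are illustrative, and the .npy layout is assumed as above): the variable is created with its shape taken from the numpy array, and the assignment is fed through a placeholder and run in the same session that will keep executing the network:

import numpy as np
import tensorflow as tf

# Assumed layout of the converted file: {layer: {'weights': ..., 'biases': ...}}.
data = np.load("vgg16.npy", allow_pickle=True, encoding="latin1").item()["conv1_1"]["weights"]

# Point 2: give the variable the same shape as the data up front.
var = tf.Variable(tf.zeros(data.shape, dtype=tf.float32), name="W_conv1_b")

# Point 1: build the assign op and run it inside the session you will keep using.
feed = tf.placeholder(tf.float32, shape=data.shape)
assign_op = var.assign(feed)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(assign_op, feed_dict={feed: data})
    # ... continue with inference or training in this same session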

I extended the same repo, caffe-tensorflow, to support Theano, so that I can load the converted caffe model in Theano. I am therefore reasonably familiar with this repo's code. Please feel free to get in touch if you have any further questions.
