Copying weights from one Conv2D layer to another

Problem Description

I have trained a model on MNIST using Keras. My goal is to print images after the first layer with the first layer being a Conv2D layer. To go about this I'm creating a new model with a single Conv2D layer in which I'll copy the weights from the trained network into the new one.

# Visualization for image after first convolution
model_temp = Sequential()
model_temp.add(Conv2D(32, (3, 3),
                         activation='relu', 
                         input_shape=(28,28,1,)))

trained_weights = model.layers[0].get_weights()[0]

model_temp.layers[0].set_weights(trained_weights)

activations = model_temp._predict(X_test)

The variable model holds the fully trained network. Also, the arguments passed to Conv2D are exactly the same as in the original model.
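
A quick way to confirm that the layer configurations really do match is a sanity check with the layer's get_config() method; the two dicts should be identical apart from the auto-generated name field:

# Compare the Conv2D configurations of the trained model and the new single-layer model
print(model.layers[0].get_config())
print(model_temp.layers[0].get_config())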

I have checked the shape of the weights for both model and model_temp, and both come back as (3, 3, 1, 32). In theory, I should be able to take the weights from the original and feed them directly into the set_weights() call on the single Conv2D layer in the new model.
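
For reference, that shape check is just the following (using the models defined above):

# Both kernels should report (3, 3, 1, 32): 3x3 window, 1 input channel, 32 filters
print(model.layers[0].get_weights()[0].shape)
print(model_temp.layers[0].get_weights()[0].shape)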

After this convolution, the variable named 'activations' would be a tensor holding, for each input image, the 32 feature maps of 26 by 26 output values (28 - 3 + 1 = 26 with the default 'valid' padding).

So when I run this code, I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-152-4ae260f0fe89> in <module>()
      7 trained_weights = model.layers[0].get_weights()[0]
      8 print(trained_weights.shape)
----> 9 model_test = model_test.layers[0].set_weights(trained_weights)
     10 
     11 activations = model_test._predict(X_test[1, 28, 28, 1])

/usr/local/lib/python2.7/dist-packages/keras/engine/topology.pyc in set_weights(self, weights)
   1189                              str(len(params)) +
   1190                              ' weights. Provided weights: ' +
-> 1191                              str(weights)[:50] + '...')
   1192         if not params:
   1193             return

ValueError: You called `set_weights(weights)` on layer "conv2d_60" with a  weight list of length 3, but the layer was expecting 2 weights. Provided weights: [[[[ -6.22274876e-01  -2.18614027e-01   5.29607059...

On the last line, why is set_weights(weights) looking for a length of two instead of three? This error message is slightly cryptic to me: if it doesn't mean a list of length two, what does "expecting 2 weights" mean?

I'm also open to suggestions on an easier way to go about this.

After inspecting the source code for set_weights() (around line 1168), I can see the error is raised in this section:

params = self.weights
if len(params) != len(weights):
    raise ValueError('You called `set_weights(weights)` on layer "' +
                     self.name +
                     '" with a  weight list of length ' +
                     str(len(weights)) +
                     ', but the layer was expecting ' +
                     str(len(params)) +
                     ' weights. Provided weights: ' +
                     str(weights)[:50] + '...')

This condition check determines whether the length of what I passed in (the (3, 3, 1, 32) array from above) equals the length of this layer's weights property. So I tested these properties as follows:

# Print contents of weights property
print(model.layers[0].weights)
print(model_test.layers[0].weights)

# Length test of tensors from get_weights call
len_test  = len(model.layers[0].get_weights()[0])
len_test2 = len(model_test.layers[0].get_weights()[0])
print("\nLength get_weights():")
print("Trained Model: ", len_test, "Test Model: ", len_test2)

# Length test of weights attributes from both models
len_test3 = len(model.layers[0].weights)
len_test4 = len(model_test.layers[0].weights)
print("\nLength weights attribute:")
print("Trained Model: ", len_test3, "Test Model: ", len_test4)

Output:

[<tf.Variable 'conv2d_17/kernel:0' shape=(3, 3, 1, 32) dtype=float32_ref>,         <tf.Variable 'conv2d_17/bias:0' shape=(32,) dtype=float32_ref>]
[<tf.Variable 'conv2d_97/kernel:0' shape=(3, 3, 1, 32) dtype=float32_ref>, <tf.Variable 'conv2d_97/bias:0' shape=(32,) dtype=float32_ref>]

Length get_weights():
('Trained Model: ', 3, 'Test Model: ', 3)

Length weights attribute:
('Trained Model: ', 2, 'Test Model: ', 2)

This output makes one hundred percent sense to me, since the convolution layers in each model are constructed exactly the same. It's also now obvious why it wants a length of two: the weights property is a list of two tf.Variable elements (the kernel and the bias).
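
Incidentally, the "length 3" on the other side of the comparison comes from calling len() on the kernel array itself: get_weights()[0] is a numpy array of shape (3, 3, 1, 32), and len() of a numpy array only reports the size of its first axis. A quick check:

import numpy as np

kernel = np.zeros((3, 3, 1, 32))  # same shape as get_weights()[0]
print(len(kernel))                # 3 -- len() only counts the first axis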

Further investigating this source file, at line 213 we see that weights holds "The concatenation of the lists trainable_weights and non_trainable_weights (in this order)".
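
For a plain Conv2D layer there are no non-trainable variables, so that concatenation can be seen directly (a quick check against the layer from above):

layer = model.layers[0]
print(len(layer.trainable_weights))      # 2 -> the kernel and the bias
print(len(layer.non_trainable_weights))  # 0 for a standard Conv2D layer
print(len(layer.weights))                # 2 -> concatenation of the two lists above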

So, sure, I can grab the weights attribute from the Conv2D layer of the original trained model and pass that in to satisfy this condition, but then this check doesn't validate the shape of the passed-in data at all. If I do pass in the weights from my original model, I get a "setting an array element with a sequence" error from numpy.

I think this is a bug in the source code. It would be awesome if someone could verify this.

Recommended Answer

You are forgetting about the bias vector. For a Conv2D layer, get_weights() returns (and set_weights() expects) a list with the weight matrix as the first element and the bias vector as the second. So the error rightly says it expects a list with 2 members. Doing the following should therefore work:

trained_weights = model.layers[0].get_weights()
model_temp.layers[0].set_weights(trained_weights)
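
Putting it together, a minimal end-to-end sketch (assuming model is the trained network from above and X_test has shape (N, 28, 28, 1)) might look like this:

from keras.models import Sequential
from keras.layers import Conv2D

# Rebuild the single-layer model with the same Conv2D configuration as the trained network
model_temp = Sequential()
model_temp.add(Conv2D(32, (3, 3),
                      activation='relu',
                      input_shape=(28, 28, 1)))

# Copy BOTH weight arrays -- the kernel and the bias -- not just get_weights()[0]
model_temp.layers[0].set_weights(model.layers[0].get_weights())

# A 3x3 'valid' convolution over 28x28 inputs with 32 filters gives (N, 26, 26, 32)
activations = model_temp.predict(X_test)
print(activations.shape)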

Also, if you want to get the output of an intermediate layer, you don't need to manually transfer weights. Doing something like the following is much more convenient:

from keras import backend as K
from keras.models import Model

# x, layer_name and data below are placeholders for your input batch, target layer name and input data
# Option 1: a backend function mapping the model input to the first layer's output
get_layer_output = K.function([model.input],
                              [model.layers[0].output])
layer_output = get_layer_output([x])[0]

# Option 2: a separate model that exposes the intermediate layer's output
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model.predict(data)
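
For the original goal of viewing the images after the first layer, one way to plot the result (a sketch, assuming matplotlib is installed and layer_output comes from the first Conv2D layer, i.e. has shape (N, 26, 26, 32)) is:

import matplotlib.pyplot as plt

# Show the first 8 of the 32 feature maps for the first image in the batch
fig, axes = plt.subplots(1, 8, figsize=(16, 2))
for i, ax in enumerate(axes):
    ax.imshow(layer_output[0, :, :, i], cmap='gray')
    ax.axis('off')
plt.show()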
