Keras: Use the same layer in different models (share weights)
Question
Quick answer:
This is in fact really easy. Here's the code (for those who don't want to read all that text):
from keras.layers import Input, Dense
from keras.models import Model

inputs = Input((784,))
encode = Dense(10, input_shape=[784])(inputs)
decode = Dense(784, input_shape=[10])
model = Model(input=inputs, output=decode(encode))

inputs_2 = Input((10,))
decode_model = Model(input=inputs_2, output=decode(inputs_2))
In this setup, decode_model will use the same decode layer as model. If you train model, decode_model will be trained, too.
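The sharing behaviour described above can be illustrated without Keras at all. Below is a minimal numpy sketch (the DenseLayer class and all names are hypothetical stand-ins, not Keras internals): a "layer" is just a weight matrix plus a bias, and two models that hold a reference to the same layer object see every update made through either of them.

```python
import numpy as np

rng = np.random.default_rng(0)

class DenseLayer:
    """Toy dense layer: y = x @ W + b (illustration only)."""
    def __init__(self, n_in, n_out):
        self.W = rng.normal(size=(n_in, n_out))
        self.b = np.zeros(n_out)

    def __call__(self, x):
        return x @ self.W + self.b

decode = DenseLayer(10, 784)   # one layer object...
autoencoder_decode = decode    # ...referenced by the full autoencoder
standalone_decode = decode     # ...and by the decode-only model

decode.W += 1.0                # a "training step" through the full model

x = np.zeros((1, 10))
# Both views produce identical outputs, because the weights are shared.
assert np.allclose(autoencoder_decode(x), standalone_decode(x))
```

This is exactly what the functional API gives you: calling the same layer object from two models reuses one set of weights rather than copying it.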
The actual question:
I'm trying to create a simple autoencoder for MNIST in Keras.
This is the code so far:
from keras.layers import Dense
from keras.models import Sequential

model = Sequential()
encode = Dense(10, input_shape=[784])
decode = Dense(784, input_shape=[10])
model.add(encode)
model.add(decode)
model.compile(loss="mse",
              optimizer="adadelta",
              metrics=["accuracy"])

decode_model = Sequential()
decode_model.add(decode)
I'm training it to learn the identity function:
model.fit(X_train, X_train, batch_size=50, nb_epoch=10, verbose=1,
          validation_data=[X_test, X_test])
The reconstructions are quite interesting:
But I would also like to look at the cluster representations. What is the output of passing [1,0...0] to the decoding layer? This should be the "cluster mean" of one class in MNIST.
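That claim can be sanity-checked with plain numpy (random stand-in weights here, not the trained decoder): a dense layer computes y = xW + b, so feeding it a one-hot vector simply selects one row of the weight matrix and adds the bias. The "cluster image" for class i is therefore row i of the trained decoder weights, plus the bias.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 784))   # stand-in for the decoder's weight matrix
b = rng.normal(size=784)         # stand-in for the decoder's bias

e0 = np.zeros((1, 10))
e0[0, 0] = 1.0                   # one-hot input for "cluster 0"

out = e0 @ W + b                 # what a dense layer computes
# A one-hot input picks out exactly one weight row (plus the bias).
assert np.allclose(out[0], W[0] + b)
```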
In order to do that, I created a second model, decode_model, which reuses the decoder layer. But if I try to use that model, it complains:
Exception: Error when checking : expected dense_input_5 to have shape (None, 784) but got array with shape (10, 10)
That seemed strange. It's just a dense layer; its weight matrix wouldn't even be able to process a 784-dim input. I decided to look at the model summary:
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
dense_14 (Dense) (None, 784) 8624 dense_13[0][0]
====================================================================================================
Total params: 8624
It is connected to dense_13. It's difficult to keep track of the layer names, but that looks like the encoder layer. Sure enough, the model summary of the whole model is:
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
dense_13 (Dense) (None, 10) 7850 dense_input_6[0][0]
____________________________________________________________________________________________________
dense_14 (Dense) (None, 784) 8624 dense_13[0][0]
====================================================================================================
Total params: 16474
____________________________________________________________________________________________________
Apparently the layers are permanently connected. Strangely, there is no input layer in my decode_model.
How can I reuse a layer in Keras? I've looked at the functional API, but there, too, layers are fused together.
Answer
Oh, never mind. I should have read the entire functional API guide: https://keras.io/getting-started/functional-api-guide/#shared-layers
Here's one of the predictions (it may still lack some training):
I'm guessing this could be a 3? Well, at least it works now.
And for those with similar problems, here's the updated code:
from keras.layers import Input, Dense
from keras.models import Model

inputs = Input((784,))
encode = Dense(10, input_shape=[784])(inputs)
decode = Dense(784, input_shape=[10])
model = Model(input=inputs, output=decode(encode))
model.compile(loss="mse",
              optimizer="adadelta",
              metrics=["accuracy"])

inputs_2 = Input((10,))
decode_model = Model(input=inputs_2, output=decode(inputs_2))
I only compiled one of the models. For training you need to compile a model; for prediction it is not necessary.