Can I share weights between keras layers but have other parameters differ?


Question

In keras, is it possible to share weights between two layers, but to have other parameters differ? Consider the following (admittedly a bit contrived) example:

conv1 = Conv2D(64, 3, input_shape=input_shape, padding='same')
conv2 = Conv2D(64, 3, input_shape=input_shape, padding='valid')

Notice that the layers are identical except for the padding. Can I get keras to use the same weights for both? (i.e. also train the network accordingly?)

I've looked at the keras doc, and the section on shared layers seems to imply that sharing works only if the layers are completely identical.
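
For reference, the documented shared-layer pattern is to call the same layer instance on several inputs; a minimal sketch of that pattern (the names here are illustrative, not from the question):

from keras.layers import Input, Conv2D

# One layer object reused on two inputs: all weights are shared, but the
# padding (and every other constructor argument) is necessarily identical,
# because there is only a single Conv2D instance.
shared_conv = Conv2D(64, 3, padding='same')
a = Input(shape=(32, 32, 3))
b = Input(shape=(32, 32, 3))
out_a = shared_conv(a)
out_b = shared_conv(b)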

Answer

To my knowledge, this cannot be done at the usual "API level" of Keras usage. However, if you dig a bit deeper, there are some (ugly) ways to share the weights.

First of all, the weights of a Conv2D layer are created inside its build() function, by calling add_weight():

    self.kernel = self.add_weight(shape=kernel_shape,
                                  initializer=self.kernel_initializer,
                                  name='kernel',
                                  regularizer=self.kernel_regularizer,
                                  constraint=self.kernel_constraint)

For your provided usage (i.e., the default trainable/constraint/regularizer/initializer), add_weight() does nothing special other than appending the weight variable to _trainable_weights:

    weight = K.variable(initializer(shape), dtype=dtype, name=name)
    ...
        self._trainable_weights.append(weight)

Finally, since build() is only called inside __call__() if the layer hasn't been built yet, shared weights between layers can be created as follows:

  1. Call conv1.build() to initialize the conv1.kernel and conv1.bias variables to be shared.
  2. Call conv2.build() to initialize the layer.
  3. Replace conv2.kernel and conv2.bias by conv1.kernel and conv1.bias.
  4. Remove conv2.kernel and conv2.bias from conv2._trainable_weights.
  5. Append conv1.kernel and conv1.bias to conv2._trainable_weights.
  6. Finish model definition. Here conv2.__call__() will be called; however, since conv2 has already been built, the weights are not going to be re-initialized.

The following code snippet may be helpful:

import numpy as np
from keras import backend as K
from keras.layers import Input, Conv2D, GlobalAveragePooling2D, Dense, concatenate
from keras.models import Model


def create_shared_weights(conv1, conv2, input_shape):
    # Steps 1-2: build both layers so their kernel/bias variables exist.
    with K.name_scope(conv1.name):
        conv1.build(input_shape)
    with K.name_scope(conv2.name):
        conv2.build(input_shape)
    # Steps 3-5: point conv2 at conv1's variables and rebuild its trainable-weight list.
    conv2.kernel = conv1.kernel
    conv2.bias = conv1.bias
    conv2._trainable_weights = []
    conv2._trainable_weights.append(conv2.kernel)
    conv2._trainable_weights.append(conv2.bias)

# check if weights are successfully shared
input_img = Input(shape=(299, 299, 3))
conv1 = Conv2D(64, 3, padding='same')
conv2 = Conv2D(64, 3, padding='valid')
create_shared_weights(conv1, conv2, input_img._keras_shape)
print(conv2.weights == conv1.weights)  # True

# check if weights are equal after model fitting
left = conv1(input_img)
right = conv2(input_img)
left = GlobalAveragePooling2D()(left)
right = GlobalAveragePooling2D()(right)
merged = concatenate([left, right])
output = Dense(1)(merged)
model = Model(input_img, output)
model.compile(loss='binary_crossentropy', optimizer='adam')

X = np.random.rand(5, 299, 299, 3)
Y = np.random.randint(2, size=5)
model.fit(X, Y)
print([np.all(w1 == w2) for w1, w2 in zip(conv1.get_weights(), conv2.get_weights())])  # [True, True]

One drawback of this hacky weight-sharing is that the weights will not remain shared after model saving/loading. This will not affect prediction, but it may be problematic if you want to load the trained model for further fine-tuning.
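
If you do need to resume fine-tuning with the weights still tied, one possible workaround (a sketch that reuses the imports and the create_shared_weights helper from above; the file name is hypothetical and this is not part of the original answer) is to save only the weight values, rebuild the shared architecture from code, and load the weights into it instead of calling load_model():

# Hypothetical workaround: save weights only, rebuild the tied model, reload.
model.save_weights('shared_conv_weights.h5')

input_img = Input(shape=(299, 299, 3))
conv1 = Conv2D(64, 3, padding='same')
conv2 = Conv2D(64, 3, padding='valid')
create_shared_weights(conv1, conv2, input_img._keras_shape)
left = GlobalAveragePooling2D()(conv1(input_img))
right = GlobalAveragePooling2D()(conv2(input_img))
output = Dense(1)(concatenate([left, right]))
restored = Model(input_img, output)
restored.load_weights('shared_conv_weights.h5')  # conv1 and conv2 still share one kernel/bias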
