Tying Autoencoder Weights in a Dense Keras Layer

Question

I am attempting to create a custom Dense layer in Keras to tie weights in an autoencoder. I tried following an example for doing this in convolutional layers here, but some of the steps did not seem to apply to the Dense layer (also, the code is from over two years ago).

By tying weights, I want the decode layer to use the transposed weight matrix of the encode layer. This approach is also taken in this article (page 5). Below is the relevant quote from the article:

Here, we choose both the encoding and decoding activation function to be sigmoid function and only consider the tied weights case, in which W' = W^T (where W^T is the transpose of W) as most existing deep learning methods do.

In the quote above, W is the weight matrix in the encode layer and W' (equal to the transpose of W) is the weight matrix in the decode layer.
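
To make the setup concrete, here is a minimal NumPy sketch of that tied forward pass (my addition, not from the original post), using sigmoid activations as in the quoted article and the 4-2-4 shapes of the model built below:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 2))      # encoder kernel: input_dim x units
b = np.zeros(2)                      # encoder bias
b_prime = np.zeros(4)                # decoder bias (biases are not tied)

x = rng.random((1, 4))               # one 4-feature sample
h = sigmoid(x @ W + b)               # encode
x_hat = sigmoid(h @ W.T + b_prime)   # decode with W' = W.T: no second kernel to learn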

I did not change much in the Dense layer. I added a tied_to parameter to the constructor, which allows you to pass the layer you want to tie it to. The only other change was to the build function; the snippet for it is below:

def build(self, input_shape):
    assert len(input_shape) >= 2
    input_dim = input_shape[-1]

    if self.tied_to is not None:
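        # reuse the encoder's kernel, transposed: K.transpose returns a new
        # symbolic op over the same underlying variable, not a copy of its values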
        self.kernel = K.transpose(self.tied_to.kernel)
        self._non_trainable_weights.append(self.kernel)
    else:
        self.kernel = self.add_weight(shape=(input_dim, self.units),
                                      initializer=self.kernel_initializer,
                                      name='kernel',
                                      regularizer=self.kernel_regularizer,
                                      constraint=self.kernel_constraint)
    if self.use_bias:
        self.bias = self.add_weight(shape=(self.units,),
                                    initializer=self.bias_initializer,
                                    name='bias',
                                    regularizer=self.bias_regularizer,
                                    constraint=self.bias_constraint)
    else:
        self.bias = None
    self.input_spec = InputSpec(min_ndim=2, axes={-1: input_dim})
    self.built = True

Below is the __init__ method; the only change here was the addition of the tied_to parameter.

def __init__(self, units,
             activation=None,
             use_bias=True,
             kernel_initializer='glorot_uniform',
             bias_initializer='zeros',
             kernel_regularizer=None,
             bias_regularizer=None,
             activity_regularizer=None,
             kernel_constraint=None,
             bias_constraint=None,
             tied_to=None,
             **kwargs):
    if 'input_shape' not in kwargs and 'input_dim' in kwargs:
        kwargs['input_shape'] = (kwargs.pop('input_dim'),)
    super(Dense, self).__init__(**kwargs)
    self.units = units
    self.activation = activations.get(activation)
    self.use_bias = use_bias
    self.kernel_initializer = initializers.get(kernel_initializer)
    self.bias_initializer = initializers.get(bias_initializer)
    self.kernel_regularizer = regularizers.get(kernel_regularizer)
    self.bias_regularizer = regularizers.get(bias_regularizer)
    self.activity_regularizer = regularizers.get(activity_regularizer)
    self.kernel_constraint = constraints.get(kernel_constraint)
    self.bias_constraint = constraints.get(bias_constraint)
    self.input_spec = InputSpec(min_ndim=2)
    self.supports_masking = True
    self.tied_to = tied_to

The call function was not edited, but it is below for reference.

def call(self, inputs):
    output = K.dot(inputs, self.kernel)
    if self.use_bias:
        output = K.bias_add(output, self.bias, data_format='channels_last')
    if self.activation is not None:
        output = self.activation(output)
    return output

In the build function above, I added a conditional to check whether the tied_to parameter was set and, if so, set the layer's kernel to the transpose of the tied_to layer's kernel.

Below is the code used to instantiate the model. It is done using Keras's Sequential API, and DenseTied is my custom layer.

# encoder
#
encoded1 = Dense(2, activation="sigmoid", input_shape=(4,))  # 4 input features, per the summary below

decoded1 = DenseTied(4, activation="sigmoid", tied_to=encoded1)

# autoencoder
#
autoencoder = Sequential()
autoencoder.add(encoded1)
autoencoder.add(decoded1)

After training the model, below is the model summary and weights.

autoencoder.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_7 (Dense)              (None, 2)                 10        
_________________________________________________________________
dense_tied_7 (DenseTied)     (None, 4)                 12        
=================================================================
Total params: 22
Trainable params: 14
Non-trainable params: 8
________________________________________________________________
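
A quick accounting of those numbers: the Dense layer has 4 × 2 = 8 kernel weights plus 2 biases, i.e. 10 trainable params; the DenseTied layer has 4 trainable biases plus the 2 × 4 = 8 entries of the tied kernel, which build registered as non-trainable, giving 12 params and the reported totals of 14 trainable and 8 non-trainable.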

autoencoder.layers[0].get_weights()[0]
array([[-2.122982  ,  0.43029135],
       [-2.1772149 ,  0.16689162],
       [-1.0465667 ,  0.9828905 ],
       [-0.6830663 ,  0.0512633 ]], dtype=float32)


autoencoder.layers[-1].get_weights()[1]
array([[-0.6521988 , -0.7131109 ,  0.14814234,  0.26533198],
       [ 0.04387903, -0.22077179,  0.517225  , -0.21583867]],
      dtype=float32)

As you can see, the weights reported by autoencoder.get_weights() do not seem to be tied.
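
To make that comparison explicit, here is a quick check (my addition; index 1 matches the get_weights() call above, where the tied kernel sits after the bias):

import numpy as np

W_enc = autoencoder.layers[0].get_weights()[0]   # encoder kernel, shape (4, 2)
W_dec = autoencoder.layers[-1].get_weights()[1]  # tied kernel, shape (2, 4)
print(np.allclose(W_dec, W_enc.T))               # False for the arrays printed above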

So after showing my approach, my question is, is this a valid way to tie weights in a Dense Keras layer? I was able to run the code, and it is currently training; the loss also seems to be decreasing reasonably. My fear is that this will only set them equal when the model is built, but not actually tie them. My hope is that the backend transpose function is tying them through references under the hood, but I am sure that I am missing something.

Answer

So after showing my approach, my question is, is this a valid way to tie weights in a Dense Keras layer?

Yes, it is valid.

My fear is that this will only set them equal when the model is built, but not actually tie them. My hope is that the backend transpose function is tying them through references under the hood, but I am sure that I am missing something.

It actually ties them in the computation graph; you can check by printing model.summary() that there is just one copy of these trainable weights. Also, after training your model you can check the weights of the corresponding layers with model.get_weights(). When the model is built there are actually no weights yet, just placeholders for them.
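
To illustrate the "references under the hood" point, here is a small backend sketch (my addition, assuming a Keras 2 TensorFlow backend): the transposed tensor is a symbolic op over the encoder's variable, so it always reflects that variable's current values.

from keras import backend as K
import numpy as np

w = K.variable(np.zeros((4, 2)))  # stands in for the encoder's kernel variable
wt = K.transpose(w)               # symbolic op reading w, not a copy of its values

K.set_value(w, np.ones((4, 2)))   # simulate a training update to the encoder
print(K.eval(wt))                 # all ones, shape (2, 4): wt tracked the update

The answer's full working example follows: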

# imports assume the standalone Keras 2.x package (the original snippet omitted them)
import random

import numpy as np
from keras import activations, constraints, initializers, regularizers
from keras import backend as K
from keras.layers import Dense, InputSpec, Layer
from keras.models import Sequential

random.seed(1)

class DenseTied(Layer):
    def __init__(self, units,
                 activation=None,
                 use_bias=True,
                 kernel_initializer='glorot_uniform',
                 bias_initializer='zeros',
                 kernel_regularizer=None,
                 bias_regularizer=None,
                 activity_regularizer=None,
                 kernel_constraint=None,
                 bias_constraint=None,
                 tied_to=None,
                 **kwargs):
        self.tied_to = tied_to
        if 'input_shape' not in kwargs and 'input_dim' in kwargs:
            kwargs['input_shape'] = (kwargs.pop('input_dim'),)
        super().__init__(**kwargs)
        self.units = units
        self.activation = activations.get(activation)
        self.use_bias = use_bias
        self.kernel_initializer = initializers.get(kernel_initializer)
        self.bias_initializer = initializers.get(bias_initializer)
        self.kernel_regularizer = regularizers.get(kernel_regularizer)
        self.bias_regularizer = regularizers.get(bias_regularizer)
        self.activity_regularizer = regularizers.get(activity_regularizer)
        self.kernel_constraint = constraints.get(kernel_constraint)
        self.bias_constraint = constraints.get(bias_constraint)
        self.input_spec = InputSpec(min_ndim=2)
        self.supports_masking = True

    def build(self, input_shape):
        assert len(input_shape) >= 2
        input_dim = input_shape[-1]

        if self.tied_to is not None:
            self.kernel = K.transpose(self.tied_to.kernel)
            self._non_trainable_weights.append(self.kernel)
        else:
            self.kernel = self.add_weight(shape=(input_dim, self.units),
                                          initializer=self.kernel_initializer,
                                          name='kernel',
                                          regularizer=self.kernel_regularizer,
                                          constraint=self.kernel_constraint)
        if self.use_bias:
            self.bias = self.add_weight(shape=(self.units,),
                                        initializer=self.bias_initializer,
                                        name='bias',
                                        regularizer=self.bias_regularizer,
                                        constraint=self.bias_constraint)
        else:
            self.bias = None

        self.built = True

    def compute_output_shape(self, input_shape):
        assert input_shape and len(input_shape) >= 2
        assert input_shape[-1] == self.units
        output_shape = list(input_shape)
        output_shape[-1] = self.units
        return tuple(output_shape)

    def call(self, inputs):
        output = K.dot(inputs, self.kernel)
        if self.use_bias:
            output = K.bias_add(output, self.bias, data_format='channels_last')
        if self.activation is not None:
            output = self.activation(output)
        return output


# input_ = Input(shape=(16,), dtype=np.float32)
# encoder
#
encoded1 = Dense(4, activation="sigmoid", input_shape=(4,), use_bias=True)
decoded1 = DenseTied(4, activation="sigmoid", tied_to=encoded1, use_bias=False)

# autoencoder
#
autoencoder = Sequential()
# autoencoder.add(input_)
autoencoder.add(encoded1)
autoencoder.add(decoded1)

autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

print(autoencoder.summary())

autoencoder.fit(x=np.random.rand(100, 4), y=np.random.randint(0, 1, size=(100, 4)))

print(autoencoder.layers[0].get_weights()[0])
print(autoencoder.layers[1].get_weights()[0])
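
As a follow-up check (my addition, not part of the original answer): since the decoder above was built with use_bias=False, its only weight is the tied kernel, which should match the transpose of the trained encoder kernel exactly.

w_enc = autoencoder.layers[0].get_weights()[0]  # encoder kernel, shape (4, 4)
w_dec = autoencoder.layers[1].get_weights()[0]  # tied decoder kernel, shape (4, 4)
print(np.allclose(w_dec, w_enc.T))              # expected True: the weights stay tied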
