Understanding output shape of keras Conv2DTranspose


Problem description


I am having a hard time understanding the output shape of keras.layers.Conv2DTranspose

Here is the prototype:

keras.layers.Conv2DTranspose(
    filters,
    kernel_size,
    strides=(1, 1),
    padding='valid',
    output_padding=None,
    data_format=None,
    dilation_rate=(1, 1),
    activation=None,
    use_bias=True,
    kernel_initializer='glorot_uniform',
    bias_initializer='zeros',
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None
)


In the documentation (https://keras.io/layers/convolutional/), I read:

If output_padding is set to None (default), the output shape is inferred.


In the code (https://github.com/keras-team/keras/blob/master/keras/layers/convolutional.py), I read:

out_height = conv_utils.deconv_length(height,
                                      stride_h, kernel_h,
                                      self.padding,
                                      out_pad_h,
                                      self.dilation_rate[0])
out_width = conv_utils.deconv_length(width,
                                     stride_w, kernel_w,
                                     self.padding,
                                     out_pad_w,
                                     self.dilation_rate[1])
if self.data_format == 'channels_first':
    output_shape = (batch_size, self.filters, out_height, out_width)
else:
    output_shape = (batch_size, out_height, out_width, self.filters)

And in the code (https://github.com/keras-team/keras/blob/master/keras/utils/conv_utils.py):

def deconv_length(dim_size, stride_size, kernel_size, padding, output_padding, dilation=1):
    """Determines output length of a transposed convolution given input length.
    # Arguments
        dim_size: Integer, the input length.
        stride_size: Integer, the stride along the dimension of `dim_size`.
        kernel_size: Integer, the kernel size along the dimension of `dim_size`.
        padding: One of `"same"`, `"valid"`, `"full"`.
        output_padding: Integer, amount of padding along the output dimension, can be set to `None` in which case the output length is inferred.
        dilation: dilation rate, integer.
    # Returns
        The output length (integer).
    """

    assert padding in {'same', 'valid', 'full'}
    if dim_size is None:
        return None

    # Get the dilated kernel size
    kernel_size = kernel_size + (kernel_size - 1) * (dilation - 1)

    # Infer length if output padding is None, else compute the exact length
    if output_padding is None:
        if padding == 'valid':
            dim_size = dim_size * stride_size + max(kernel_size - stride_size, 0)
        elif padding == 'full':
            dim_size = dim_size * stride_size - (stride_size + kernel_size - 2)
        elif padding == 'same':
            dim_size = dim_size * stride_size
    else:
        if padding == 'same':
            pad = kernel_size // 2
        elif padding == 'valid':
            pad = 0
        elif padding == 'full':
            pad = kernel_size - 1

        dim_size = ((dim_size - 1) * stride_size + kernel_size - 2 * pad + output_padding)

    return dim_size
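Since the deconv_length logic above is plain arithmetic, it can be exercised without installing Keras. The following is a pure-Python copy of the function (same branches and formulas as the source above, condensed; not the Keras API itself), applied to a stride-10 example:

```python
# Pure-Python copy of Keras' deconv_length (same branches as the source
# above), so the output-shape arithmetic can be checked without Keras.
def deconv_length(dim_size, stride_size, kernel_size, padding,
                  output_padding, dilation=1):
    assert padding in {'same', 'valid', 'full'}
    if dim_size is None:
        return None
    # Dilated kernel size
    kernel_size = kernel_size + (kernel_size - 1) * (dilation - 1)
    if output_padding is None:
        # Inferred output length
        if padding == 'valid':
            return dim_size * stride_size + max(kernel_size - stride_size, 0)
        if padding == 'full':
            return dim_size * stride_size - (stride_size + kernel_size - 2)
        return dim_size * stride_size  # 'same'
    # Exact output length, chosen via output_padding
    pad = {'same': kernel_size // 2, 'valid': 0, 'full': kernel_size - 1}[padding]
    return (dim_size - 1) * stride_size + kernel_size - 2 * pad + output_padding

print(deconv_length(20, 10, 3, 'same', None))  # inferred: 20 * 10 = 200
print(deconv_length(20, 10, 3, 'same', 4))     # explicit: (20-1)*10 + 3 - 2 + 4 = 195
```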


I understand that Conv2DTranspose is kind of a Conv2D, but reversed.


Since applying a Conv2D with kernel_size = (3, 3), strides = (10, 10) and padding = "same" to a 200x200 image will output a 20x20 image, I assume that applying a Conv2DTranspose with kernel_size = (3, 3), strides = (10, 10) and padding = "same" to a 20x20 image will output a 200x200 image.
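This shape arithmetic can be checked directly. A minimal pure-Python sketch (the helper names are mine, not Keras API): with padding="same", a Conv2D output size is ceil(input / stride), and the inferred Conv2DTranspose output size is input * stride.

```python
import math

def conv2d_same_out(dim, stride):
    # Conv2D with padding='same': output = ceil(input / stride)
    return math.ceil(dim / stride)

def conv2dtranspose_same_out(dim, stride):
    # Conv2DTranspose with padding='same' and output_padding=None:
    # the inferred output is simply input * stride
    return dim * stride

print(conv2d_same_out(200, 10))          # 20
print(conv2d_same_out(195, 10))          # also 20
print(conv2dtranspose_same_out(20, 10))  # 200
```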


Also, applying a Conv2D with kernel_size = (3, 3), strides = (10, 10) and padding = "same" to a 195x195 image will also output a 20x20 image.


So, I understand that there is kind of an ambiguity on the output shape when applying a Conv2DTranspose with kernel_size = (3, 3), strides = (10, 10) and padding = "same" (user might want output to be 195x195, or 200x200, or many other compatible shapes).


I assume that "the output shape is inferred." means that a default output shape is computed according to the parameters of the layer, and I assume that there is a mechanism to specify an output shape different from the default one, if necessary.

That said, I do not quite understand:

  • the meaning of the "output_padding" parameter

  • the interactions between the parameters "padding" and "output_padding"

  • the various formulas in the function keras.conv_utils.deconv_length

Could someone explain?

Many thanks,

Julien

Accepted answer


I may have found a (partial) answer.


I found it in the Pytorch documentation, which appears to be much clearer than the Keras documentation on this topic.

When applying a Conv2D with a stride greater than 1 to images whose dimensions are close, we get output images with the same dimensions.

For instance, when applying a Conv2D with a kernel size of 3x3, a stride of 7x7 and padding "same", the following image dimensions


22x22, 23x23, ..., 28x28, 22x28, 28x22, 27x24, etc. (7x7 = 49 combinations)


will ALL yield an output dimension of 4x4.


That is because output_dimension = ceiling(input_dimension / stride).
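A quick enumeration confirms the 49-way ambiguity. This is a pure-Python check of the ceiling rule, not Keras code:

```python
import math

stride = 7
# All 'same'-padding input sizes whose Conv2D output is 4 along one axis
sizes = [d for d in range(1, 100) if math.ceil(d / stride) == 4]

print(sizes)       # [22, 23, 24, 25, 26, 27, 28]
print(len(sizes))  # 7 per axis, hence 7 * 7 = 49 image shapes in 2D
```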


As a consequence, when applying a Conv2DTranspose with kernel size of 3x3, stride of 7x7 and padding "same", there is an ambiguity about the output dimension.


Any of the 49 possible output dimensions would be correct.

The parameter output_padding is a way to resolve the ambiguity by explicitly choosing the output dimension.


In my example, the minimum output size is 22x22, and output_padding provides a number of rows (between 0 and 6) to add at the bottom of the output image and a number of columns (between 0 and 6) to add at the right of the output image.


So I can get output_dimensions = 24x25 if I use output_padding = (2, 3).
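This can be checked against the explicit ('same'-padding) branch of deconv_length. A small pure-Python sketch (the helper name is mine):

```python
def transpose_same_out(dim, stride, kernel, output_padding):
    # Explicit branch of Keras' deconv_length for padding='same':
    # out = (dim - 1) * stride + kernel - 2 * (kernel // 2) + output_padding
    pad = kernel // 2
    return (dim - 1) * stride + kernel - 2 * pad + output_padding

# 4x4 input, kernel 3x3, stride 7x7, padding 'same'
print(transpose_same_out(4, 7, 3, 0))  # 22 (minimum output size)
print(transpose_same_out(4, 7, 3, 2))  # 24
print(transpose_same_out(4, 7, 3, 3))  # 25
```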


What I still do not understand, however, is the logic that Keras uses to choose a certain output image dimension when output_padding is not specified (when it "infers" the output shape).

Some pointers:

https://pytorch.org/docs/stable/nn.html#torch.nn.ConvTranspose2d
https://discuss.pytorch.org/t/the-output-size-of-convtranspose2d-differs-from-the-expected-output-size/1876/5
https://discuss.pytorch.org/t/question-about-the-output-padding-in-nn-convtrasnpose2d/19740
https://discuss.pytorch.org/t/what-does-output-padding-exactly-do-in-convtranspose2d/2688


So to answer my own questions:

  • the meaning of the "output_padding" parameter: see above
  • the interactions between the parameters "padding" and "output_padding": these parameters are independent
  • the various formulas in the function keras.conv_utils.deconv_length:
    • For now, I do not understand the part where output_padding is None;
    • I ignore the case where padding == 'full' (not supported by Conv2DTranspose);
    • The formula for padding == 'valid' seems correct (it can be computed by reversing the formula of Conv2D);
    • The formula for padding == 'same' seems incorrect to me when kernel_size is even. (As a matter of fact, Keras crashes when trying to build a Conv2DTranspose layer with input_dimension = 5x5, kernel_size = 2x2, stride = 7x7 and padding = 'same'. It appears to me that there is a bug in Keras; I will start another thread on this topic...)
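The 'valid' claim above can be sanity-checked by round-tripping: applying the transposed-convolution length formula and then the ordinary Conv2D 'valid' formula should recover the original length. A pure-Python sketch (helper names are mine):

```python
def conv2d_valid_out(dim, stride, kernel):
    # Conv2D, padding='valid': floor((dim - kernel) / stride) + 1
    return (dim - kernel) // stride + 1

def transpose_valid_out(dim, stride, kernel):
    # Inferred Conv2DTranspose output, padding='valid' (from deconv_length):
    # dim * stride + max(kernel - stride, 0)
    return dim * stride + max(kernel - stride, 0)

# Round trip: upsampling then downsampling recovers the original length
for dim in range(1, 20):
    up = transpose_valid_out(dim, 7, 3)
    assert conv2d_valid_out(up, 7, 3) == dim
print("round trip OK for dims 1..19")
```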

