Shape of image after MaxPooling2D with padding='same': calculating layer-by-layer shape in a convolutional autoencoder

Question

In short, my question is why the image size does not stay the same as the input size after a max-pooling layer when I use padding='same' in Keras. I am going through the Keras blog post "Building Autoencoders in Keras" and building a convolutional autoencoder. The autoencoder code is as follows:

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

input_layer = Input(shape=(28, 28, 1))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_layer)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(input_layer, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

As per autoencoder.summary(), the output of the very first layer, Conv2D(16, (3, 3), activation='relu', padding='same')(input_layer), is 28 x 28 x 16, i.e. the same spatial size as the input image. This is because padding is 'same'.


In [49]: autoencoder.summary()
(The layer numbers were added by me; they are not part of the output.)
_________________________________________________________________
  Layer (type)                 Output Shape             Param #   
=================================================================
1.input_1 (InputLayer)         (None, 28, 28, 1)         0         
_________________________________________________________________
2.conv2d_1 (Conv2D)            (None, 28, 28, 16)        160       
_________________________________________________________________
3.max_pooling2d_1 (MaxPooling2 (None, 14, 14, 16)        0         
_________________________________________________________________
4.conv2d_2 (Conv2D)            (None, 14, 14, 8)         1160      
_________________________________________________________________
5.max_pooling2d_2 (MaxPooling2 (None, 7, 7, 8)           0         
_________________________________________________________________
6.conv2d_3 (Conv2D)            (None, 7, 7, 8)           584       
_________________________________________________________________
7.max_pooling2d_3 (MaxPooling2 (None, 4, 4, 8)           0         
_________________________________________________________________
8.conv2d_4 (Conv2D)            (None, 4, 4, 8)           584       
_________________________________________________________________
9.up_sampling2d_1 (UpSampling2 (None, 8, 8, 8)           0         
_________________________________________________________________
10.conv2d_5 (Conv2D)            (None, 8, 8, 8)           584       
_________________________________________________________________
11.up_sampling2d_2 (UpSampling2 (None, 16, 16, 8)         0         
_________________________________________________________________
12.conv2d_6 (Conv2D)            (None, 14, 14, 16)        1168      
_________________________________________________________________
13.up_sampling2d_3 (UpSampling2 (None, 28, 28, 16)        0         
_________________________________________________________________
14.conv2d_7 (Conv2D)            (None, 28, 28, 1)         145       
=================================================================

The next layer (layer 3) is MaxPooling2D((2, 2), padding='same')(x). summary() shows the output size of this layer as 14 x 14 x 16. But padding in this layer is also 'same'. So why does the output not remain 28 x 28 x 16, with padded zeros?

Also, it is not clear how the output shape changes to (14 x 14 x 16) after layer 12, when the input coming from the previous layer has shape (16 x 16 x 8).

Recommended answer


The next layer (layer 3) is MaxPooling2D((2, 2), padding='same')(x). summary() shows the output size of this layer as 14 x 14 x 16. But padding in this layer is also 'same'. So why does the output not remain 28 x 28 x 16, with padded zeros?

There seems to be a misunderstanding of what padding does. Padding only determines what happens at the image borders; it does not control how far the pooling window moves between steps. You have a 2x2 max-pooling operation, and in Keras the default stride equals the pool size, so strides=2, which halves the image size. To avoid that, specify strides=1 explicitly. From the Keras documentation:


pool_size: integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimension. If only one integer is specified, the same window length will be used for both dimensions.

strides: Integer, tuple of 2 integers, or None. Strides values. If None, it will default to pool_size.
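
The effect of the stride can be checked with a few lines of arithmetic. The sketch below assumes the usual TensorFlow/Keras rule that, with padding='same', the output length along each spatial axis is ceil(input / stride), independent of the window size. The helper name same_out is hypothetical and is not part of Keras.

```python
import math


def same_out(size, stride):
    """Output length along one spatial axis for padding='same'.

    Assumed rule: ceil(size / stride). The window size does not matter.
    """
    return math.ceil(size / stride)


# MaxPooling2D((2, 2), padding='same') leaves strides=None,
# so strides defaults to pool_size, i.e. 2.
print(same_out(28, 2))  # 14
print(same_out(14, 2))  # 7
print(same_out(7, 2))   # 4: the odd size is padded up rather than truncated

# With strides=1 the spatial size is preserved.
print(same_out(28, 1))  # 28
```

Under this rule the three pooling layers give 28 -> 14 -> 7 -> 4, which matches layers 3, 5 and 7 in the summary above.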

Second question


Also, it is not clear how the output shape changes to (14 x 14 x 16) after layer 12, when the input coming from the previous layer has shape (16 x 16 x 8).

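Layer 12 (conv2d_6) does not specify padding='same', so Keras uses the default, padding='valid'. With no padding, a 3x3 kernel at stride 1 trims one pixel from each border, so each spatial dimension goes from 16 to 16 - 3 + 1 = 14.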
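
The same arithmetic can be traced through the whole decoder. This is a minimal sketch, assuming the standard rules floor((size - kernel) / stride) + 1 for padding='valid' and ceil(size / stride) for padding='same'. The helpers conv_out and decoder_size are hypothetical names used only for illustration.

```python
import math


def conv_out(size, kernel, stride=1, padding="valid"):
    """Output length along one spatial axis of a convolution (assumed rules)."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1


def decoder_size(size, layer12_padding):
    """Trace one spatial dimension through layers 8 to 14 of the summary."""
    size = conv_out(size, 3, padding="same")           # conv2d_4
    size *= 2                                          # up_sampling2d_1
    size = conv_out(size, 3, padding="same")           # conv2d_5
    size *= 2                                          # up_sampling2d_2
    size = conv_out(size, 3, padding=layer12_padding)  # conv2d_6 (layer 12)
    size *= 2                                          # up_sampling2d_3
    return conv_out(size, 3, padding="same")           # conv2d_7


print(conv_out(16, 3))           # 14
print(decoder_size(4, "valid"))  # 28
print(decoder_size(4, "same"))   # 32
```

If these rules hold, leaving padding unset on layer 12 is what brings the decoder output back to 28 x 28. With padding='same' on that layer the output would be 32 x 32 and would no longer match the input.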
