Why use same padding with max pooling?


Question

While going through the autoencoder tutorial in Keras blog, I saw that the author uses same padding in max pooling layers in Convolutional Autoencoder part, as shown below.

x = MaxPooling2D((2, 2), padding='same')(x)

Could someone explain the reason behind this? With max pooling, we want to reduce the height and width but why is same padding, which keeps height and width the same, used here?

In addition, the result of this code halves the dimensions by 2, so the same padding doesn't seem to work.

Answer

From https://keras.io/layers/convolutional/

相同"导致填充输入,以便输出具有相同的 长度作为原始输入.

"same" results in padding the input such that the output has the same length as the original input.

From https://keras.io/layers/pooling/

pool_size: integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimensions. If only one integer is specified, the same window length will be used for both dimensions.

So first, let's ask why we use padding at all. In the convolutional context it matters because we want every pixel, including those at the edges and corners, to get a turn at the "center" of the kernel; there may be important features at the image borders that the kernel is looking for. So for Conv2D we pad around the edges, and the layer returns an output the same size as its input.
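The effect of 'valid' versus 'same' padding on a convolution's output size can be sketched with the standard shape formulas. This is a minimal pure-Python sketch of the arithmetic, not a call into Keras itself:

```python
import math

def conv_out(size, kernel, stride=1, padding="valid"):
    """Output length of a convolution along one spatial axis."""
    if padding == "same":
        # 'same' pads the input so that output = ceil(input / stride)
        return math.ceil(size / stride)
    # 'valid' uses only positions where the kernel fits entirely inside
    return (size - kernel) // stride + 1

# A 3x3 kernel with stride 1 on a 28-pixel axis:
print(conv_out(28, 3, padding="same"))   # 28 -- size preserved
print(conv_out(28, 3, padding="valid"))  # 26 -- border pixels are lost
```

With stride 1 and 'same' padding the output always matches the input, which is exactly what the Keras documentation quoted above describes.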

However, in the case of the MaxPooling2D layer we pad for a similar reason, but now the stride matters: in Keras, the stride defaults to the pool size. Since the pool size here is 2, the image is (roughly) halved each time it passes through a pooling layer; 'same' padding ensures that the leftover row/column of an odd-sized input is pooled rather than silently dropped.
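This is easiest to see on an odd input size: with 'valid' pooling the last row/column is discarded, while 'same' pads so that nothing is thrown away. A sketch of the shape arithmetic, assuming stride equals pool size (the Keras default):

```python
import math

def pool_out(size, pool, padding="valid"):
    """Output length of a pooling layer along one spatial axis."""
    stride = pool  # Keras MaxPooling2D defaults strides to pool_size
    if padding == "same":
        return math.ceil(size / stride)
    return (size - pool) // stride + 1

print(pool_out(7, 2, "same"))   # 4 -- the odd edge is padded, then pooled
print(pool_out(7, 2, "valid"))  # 3 -- the 7th row/column is simply dropped
```

So 'same' padding does not keep the pooled output the same size as the input; it only controls what happens to the remainder when the size is not divisible by the stride.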

from keras.layers import Input, Conv2D, MaxPooling2D

input_img = Input(shape=(28, 28, 1))  # adapt this if using `channels_first` image data format

x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# at this point the representation is (4, 4, 8) i.e. 128-dimensional

So in the case of the tutorial example, the image dimensions go from 28 -> 14 -> 7 -> 4, with each arrow representing one pooling layer. The last step, 7 -> 4 rather than 7 -> 3, is only possible because of the 'same' padding.
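The chain can be checked by applying ceil-division once per pooling layer; with 'same' padding and pool size 2, each layer computes ceil(n / 2). A quick sketch:

```python
import math

size = 28
sizes = [size]
for _ in range(3):  # three MaxPooling2D((2, 2), padding='same') layers
    size = math.ceil(size / 2)
    sizes.append(size)
print(sizes)  # [28, 14, 7, 4]
```

With 'valid' padding the last layer would instead produce 3 (since 7 is odd), and the encoded representation would be (3, 3, 8) rather than the (4, 4, 8) the tutorial states.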

