What does padding='same' exactly mean in tensorflow Conv2D? Is it minimum padding or input_shape == output_shape?

Question

TL;DR: How can I modify the code given below to incorporate the padding = 'same' method?

I was trying to build my own CNN using numpy and got confused by two conflicting answers about padding = 'same'.

This answer says:

padding='Same' in Keras means padding is added as required to make up for overlaps when the input size and kernel size do not perfectly fit

So according to this, same means the minimum padding required in each direction. If that's the case, shouldn't the padding be equal on both sides? Or, if the minimum required padding was 2, shouldn't that be a valid candidate for being distributed equally across all 4 sides? What if the required padding was just 3? What happens then?

Also, what bothers me is the official tensorflow documentation, which says:

"same" results in padding with zeros evenly to the left/right or up/down of the input such that output has the same height/width dimension as the input.

So what is the right answer?

Here is the code that I have written for padding:

import numpy as np
from typing import Union

def add_padding(X: np.ndarray, pad_size: Union[int, list, tuple], pad_val: int = 0) -> np.ndarray:
    '''
    Pad the input image array equally on both ends of each spatial axis
    args:
        X: Input image in the form [Batch, Width, Height, Channels]
        pad_size: How much padding should be done. If int, the same padding is applied to both spatial axes. Otherwise specify the padding per axis as (height_pad, width_pad) OR (y_pad, x_pad)
        pad_val: Value to pad with. Usually it is 0 (zero padding)
    return:
        Padded numpy image array
    '''
    assert (len(X.shape) == 4), "Input image should be in the form [Batch, Width, Height, Channels]"
    if isinstance(pad_size, int):
        y_pad = x_pad = pad_size
    else:
        y_pad = pad_size[0]
        x_pad = pad_size[1]

    # Do not pad the first (batch) and last (channel) axes. Axis 1 is padded with y_pad and axis 2 with x_pad, on both ends
    pad_width = ((0, 0), (y_pad, y_pad), (x_pad, x_pad), (0, 0))
    return np.pad(X, pad_width=pad_width, mode='constant', constant_values=(pad_val, pad_val))


# Another part of my Layer
# New Height/Width is dependent on the old height/ width, stride, filter size, and amount of padding
h_new = int((h_old + (2 * padding_size) - filter_size) / self.stride) + 1
w_new = int((w_old + (2 * padding_size) - filter_size) / self.stride) + 1

The full code for this layer is presented here.

Recommended answer

According to this SO answer, the name 'SAME' padding simply comes from the property that when the stride equals 1, the output spatial shape is the same as the input spatial shape.

However, that is not the case when the stride doesn't equal 1. In general, the output spatial shape is determined by the following formula.

In all cases, the definition of 'SAME' is to apply the padding in the tensorflow way such that

For each spatial dimension i,
output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides[i])
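The formula above can be sketched in a few lines of plain Python. This is only an illustration: the helper name same_output_shape is made up for this example and is not a tensorflow function. Note that the filter size does not appear in the formula at all.

```python
import math

def same_output_shape(input_spatial_shape, strides):
    """Spatial output shape under 'SAME' padding: ceil(input / stride) per dimension."""
    return [math.ceil(n / s) for n, s in zip(input_spatial_shape, strides)]

# Stride 1 keeps the size; a larger stride shrinks it.
print(same_output_shape([2000, 125], [4, 1]))  # [500, 125]
print(same_output_shape([7, 7], [2, 2]))       # [4, 4]
```

With a stride of 1 the output shape equals the input shape, which is where the name comes from; with a stride of 2 a 7x7 input gives a 4x4 output.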

So what is the tensorflow way to apply the padding?

First, the total padding needed for each spatial dimension is determined by the following algorithm.

# e.g. for a 2D image, num_spatial_dim = 2
def get_padding_needed(input_spatial_shape, filter_shape, strides):
    num_spatial_dim = len(input_spatial_shape)
    padding_needed = [0] * num_spatial_dim

    for i in range(num_spatial_dim):
        if input_spatial_shape[i] % strides[i] == 0:
            # the input size is a multiple of the stride
            padding_needed[i] = max(filter_shape[i] - strides[i], 0)
        else:
            # otherwise the remainder takes the place of the stride
            padding_needed[i] = max(filter_shape[i] - (input_spatial_shape[i] % strides[i]), 0)

    return padding_needed

# example
print(get_padding_needed(input_spatial_shape=[2000, 125], filter_shape=[8, 4], strides=[4, 1]))
# [4, 3]

As you can see, the padding needed for the first spatial dimension is an even number, 4. That case is simple: just pad 2 zeros at each end of the first spatial dimension.

Second, the padding needed for the second dimension is an odd number, 3. In that case, tensorflow pads fewer zeros at the starting end.

In other words, if the dimension is height and the padding needed is 3, it will pad 1 zero at the top and 2 zeros at the bottom. If the dimension is width and the padding needed is 5, it will pad 2 zeros at the left and 3 zeros at the right, and so on.
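Putting the two steps together, here is a minimal numpy sketch of how the padding function from the question might be adapted. The names split_padding and pad_same are hypothetical helpers written for this illustration, and the sketch assumes the array layout is [Batch, Height, Width, Channels]; if your layout is different, swap the axes accordingly.

```python
import numpy as np

def split_padding(total):
    """Split a total padding amount into (before, after); any extra zero goes at the end."""
    before = total // 2
    return before, total - before

def pad_same(X, filter_shape, strides, pad_val=0):
    """Pad a [Batch, Height, Width, Channels] array following the 'SAME' rule described above."""
    pads = []
    for size, k, s in zip(X.shape[1:3], filter_shape, strides):
        rem = size % s
        # total padding needed for this spatial dimension
        total = max(k - s, 0) if rem == 0 else max(k - rem, 0)
        pads.append(split_padding(total))
    # batch and channel axes are left unpadded
    pad_width = ((0, 0), pads[0], pads[1], (0, 0))
    return np.pad(X, pad_width, mode="constant", constant_values=pad_val)

X = np.ones((1, 5, 6, 1))
padded = pad_same(X, filter_shape=(3, 4), strides=(1, 1))
print(padded.shape)  # (1, 7, 9, 1)
```

Here the height needs 2 zeros in total (1 at the top, 1 at the bottom) and the width needs 3 (1 at the left, 2 at the right). Because the padding is no longer symmetric, the output-size formula in the layer should use the total padding per dimension in place of 2 * padding_size, for example h_new = (h_old + pad_total_h - filter_size) // stride + 1, which gives 5 and 6 for this input.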

References:

  1. https://www.tensorflow.org/api_docs/python/tf/nn/convolution
  2. https://mmuratarat.github.io/2019-01-17/implementing-padding-schemes-of-tensorflow-in-python
