How to update the weights of a Deconvolutional Layer?


Question

I'm trying to develop a deconvolutional layer (or a transposed convolutional layer, to be precise).

In the forward pass, I do a full convolution (convolution with zero padding). In the backward pass, I do a valid convolution (convolution without padding) to pass the errors to the previous layer.
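To make the two modes concrete, here is a minimal shape-level sketch using scipy.signal (the library calls are my addition, not part of the original question):

import numpy as np
from scipy.signal import convolve2d

x = np.random.randn(2, 2)      # layer input
w = np.random.randn(3, 3)      # filter
d_out = np.random.randn(4, 4)  # error arriving from the next layer

# Forward: full convolution zero-pads x, output is (2+3-1) x (2+3-1) = 4x4
out = convolve2d(x, w, mode='full')

# Backward w.r.t. the input: valid convolution shrinks the error back to 2x2
dx = convolve2d(d_out, w, mode='valid')

print(out.shape, dx.shape)  # (4, 4) (2, 2)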

The gradients of the biases are easy to compute: it's simply a matter of summing over the superfluous dimensions.

The problem is that I don't know how to update the weights of the convolutional filters. What are the gradients? I'm sure it's a convolution operation, but I don't see how. I tried a valid convolution of the inputs with the errors, but to no avail.

Answer

Deconvolution explained

First of all, deconvolution is a convolutional layer, only used for a different purpose, namely upsampling (why it's useful is explained in this paper).

For example, here a 2x2 input image (bottom image in blue) is upsampled to 4x4 (top image in green):

[animation: a 3x3 filter sliding over the zero-padded 2x2 input]

To make it a valid convolution, the input is first padded to make it 6x6, after which a 3x3 filter is applied without striding. Just like in an ordinary convolutional layer, you can choose different padding/striding strategies to produce the image size you want.
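As a sanity check, here is a minimal numpy sketch of that forward pass (written for this answer; like most deep-learning code it actually computes a cross-correlation):

import numpy as np

x = np.random.randn(2, 2)                # 2x2 input
w = np.random.randn(3, 3)                # 3x3 filter
pad = 2                                  # filter size minus 1 pads the input to 6x6

x_pad = np.pad(x, pad, mode='constant')  # 6x6
out = np.zeros((4, 4))                   # 6 - 3 + 1 = 4
for i in range(4):
    for j in range(4):
        out[i, j] = np.sum(w * x_pad[i:i+3, j:j+3])  # 3x3 window, stride 1

print(out.shape)  # (4, 4): the 2x2 input has been upsampled to 4x4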

Now it should be clear that the backward pass for deconvolution is a special case of the backward pass for a convolutional layer, with a particular stride and padding. I think you've done it already, but here's a naive (and not very efficient) implementation for any stride and padding:

import numpy as np

# input: x, w, b, stride, pad, d_out
# output: dx, dw, db <- gradients with respect to x, w, and b

N, C, H, W = x.shape
F, C, HH, WW = w.shape
N, F, H_out, W_out = d_out.shape

x_pad = np.pad(x, pad_width=((0, 0), (0, 0), (pad, pad), (pad, pad)),
               mode='constant', constant_values=0)

# bias gradient: sum the errors over every dimension except the filter one
db = np.sum(d_out, axis=(0, 2, 3))

dw = np.zeros_like(w)
dx = np.zeros_like(x_pad)
for n in range(N):        # images in the batch
  for f in range(F):      # filters
    filter_w = w[f, :, :, :]
    for out_i in range(H_out):
      for out_j in range(W_out):
        i, j = out_i * stride, out_j * stride
        # weight gradient: upstream error times the input window that produced it
        dw[f, :, :, :] += d_out[n, f, out_i, out_j] * x_pad[n, :, i:i+HH, j:j+WW]
        # input gradient: upstream error spread back through the filter
        dx[n, :, i:i+HH, j:j+WW] += filter_w * d_out[n, f, out_i, out_j]
# strip the zero padding to get the gradient w.r.t. the original input
dx = dx[:, :, pad:pad+H, pad:pad+W]

The same can be done more efficiently using im2col and col2im, but that's just an implementation detail. Another fun fact: the backward pass for a convolution operation (for both the data and the weights) is again a convolution, but with spatially-flipped filters.
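To see the flipped-filter fact in action, here is a small check (scipy.signal is my addition, not part of the original answer) for a stride-1, unpadded convolution:

import numpy as np
from scipy.signal import correlate2d

w = np.random.randn(3, 3)      # filter
d_out = np.random.randn(3, 3)  # error for the 3x3 'valid' output of a 5x5 input

# naive input gradient: scatter each error value back through the filter window
dx = np.zeros((5, 5))
for i in range(3):
    for j in range(3):
        dx[i:i+3, j:j+3] += w * d_out[i, j]

# the same thing as one operation: a full convolution with the flipped filter
dx_conv = correlate2d(d_out, w[::-1, ::-1], mode='full')

print(np.allclose(dx, dx_conv))  # True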

Here's how it's applied (plain simple SGD):

# backward_msg is the message from the next layer, usually ReLU
# conv_cache holds (x, w, b, conv_params), i.e. the info from the forward pass
backward_msg, dW, db = conv_backward(backward_msg, conv_cache)
w = w - learning_rate * dW
b = b - learning_rate * db

As you can see, it's pretty straightforward: you just need to understand that you're applying the same old convolution.
