How can I implement a deconvolution layer for a CNN in numpy?


Question

I am trying to implement a deconvolution layer for a convolutional network. What I mean by deconvolution: suppose I feed a 3x227x227 input image to a layer with filters of size 3x11x11 and stride 4, so the resulting feature map has size 55x55. What I want to do is apply the reverse operation, projecting the 55x55 feature map back to a 3x227x227 image. Basically, each value on the 55x55 feature map is weighted by the 3x11x11 filters and projected to image space, and overlapping regions due to the stride are averaged.

I tried to implement it in numpy without any success. I found a solution with brute-force nested for loops, but it is damn slow. How can I implement it efficiently in numpy? Any help is welcome.
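As a quick sanity check, the 55x55 figure in the question follows from the standard convolution output-size formula. The helper name below is mine, not from the post:

```python
# Sanity check for the sizes in the question: with no padding, a
# convolution's output size is (input - filter) // stride + 1.
def conv_out_size(n, f, stride, pad=0):
    return (n + 2 * pad - f) // stride + 1

print(conv_out_size(227, 11, stride=4))  # -> 55
```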

Answer

As discussed in this question, a deconvolution is just a convolutional layer, but with a particular choice of padding, stride and filter size.

For example, if your current image size is 55x55, you can apply a convolution with padding=20, stride=1 and filter=[21x21] to obtain a 75x75 image, then 95x95 and so on. (I'm not saying this choice of numbers gives the desired quality of the output image, just the size. Actually, I think downsampling from 227x227 to 55x55 and then upsampling back to 227x227 is too aggressive, but you are free to try any architecture).
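The size progression above can be checked with the stride-1 output-size formula. The helper name is my own, not from the answer:

```python
# With stride 1, the output size is input + 2*pad - filter + 1, so
# padding=20 and a 21x21 filter grow the image by 20 pixels per layer.
def conv_out_size(n, f, stride=1, pad=0):
    return (n + 2 * pad - f) // stride + 1

size = 55
for _ in range(2):
    size = conv_out_size(size, 21, pad=20)
    print(size)  # 75, then 95
```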

Here's the implementation of a forward pass for any stride and padding. It does an im2col transformation, but using stride_tricks from numpy. It's not as optimized as modern GPU implementations, but it is definitely faster than 4 inner loops:

import numpy as np

def conv_forward(x, w, b, stride, pad):
  N, C, H, W = x.shape
  F, _, HH, WW = w.shape

  # Check dimensions
  assert (W + 2 * pad - WW) % stride == 0, 'width does not work'
  assert (H + 2 * pad - HH) % stride == 0, 'height does not work'

  # Pad the input
  p = pad
  x_padded = np.pad(x, ((0, 0), (0, 0), (p, p), (p, p)), mode='constant')

  # Figure out output dimensions
  H += 2 * pad
  W += 2 * pad
  # Use integer division so the sizes below are ints (required for reshape)
  out_h = (H - HH) // stride + 1
  out_w = (W - WW) // stride + 1

  # Perform an im2col operation by picking clever strides
  shape = (C, HH, WW, N, out_h, out_w)
  strides = (H * W, W, 1, C * H * W, stride * W, stride)
  strides = x.itemsize * np.array(strides)
  x_stride = np.lib.stride_tricks.as_strided(x_padded,
                                             shape=shape, strides=strides)
  x_cols = np.ascontiguousarray(x_stride)
  x_cols.shape = (C * HH * WW, N * out_h * out_w)

  # Now all our convolutions are a big matrix multiply
  res = w.reshape(F, -1).dot(x_cols) + b.reshape(-1, 1)

  # Reshape the output
  res.shape = (F, N, out_h, out_w)
  out = res.transpose(1, 0, 2, 3)
  out = np.ascontiguousarray(out)
  return out
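The reverse projection the question actually asks for can be sketched in the same spirit: scatter each weighted filter patch back into image space and divide by an overlap count. The `deconv_naive` helper below is my own illustration, not part of the original answer; it loops only over the feature-map positions and vectorizes over batch and channels:

```python
import numpy as np

def deconv_naive(feat, w, stride, out_h, out_w):
    """Project a feature map back to image space.

    Each feature value weights its CxHHxWW filter and scatters the
    result into the output; overlapping regions are averaged.
    Illustrative helper, not from the original answer.
    """
    N, F, Hf, Wf = feat.shape
    _, C, HH, WW = w.shape
    out = np.zeros((N, C, out_h, out_w))
    count = np.zeros((1, 1, out_h, out_w))
    w_flat = w.reshape(F, -1)              # (F, C*HH*WW)
    for i in range(Hf):
        for j in range(Wf):
            h0, w0 = i * stride, j * stride
            # (N, F) @ (F, C*HH*WW) -> one weighted patch per sample
            patch = feat[:, :, i, j].dot(w_flat).reshape(N, C, HH, WW)
            out[:, :, h0:h0 + HH, w0:w0 + WW] += patch
            count[:, :, h0:h0 + HH, w0:w0 + WW] += 1
    return out / np.maximum(count, 1)
```

For the sizes in the question, `deconv_naive(feat, w, stride=4, out_h=227, out_w=227)` with `feat` of shape `(N, F, 55, 55)` and `w` of shape `(F, 3, 11, 11)` returns an `(N, 3, 227, 227)` array.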
