Convolutional Neural Networks: How many pixels will be covered by each of the filters?

Question

How can I calculate the area (in the original image) covered by each of the filters in my network?

For example, say the image is WxW pixels and I am using the following network:

layer 1 : conv :  5x5
layer 2 : pool :  3x3
layer 3 : conv :  5x5
.....
layer N : conv :  5x5

I want to calculate how much of the original image is covered by each filter.

For example, a filter in layer 1 covers 5x5 pixels of the original image.

Recommended answer

A similar problem is: how many pixels are covered by each activation? That is essentially the same as asking how large the input image has to be in order to produce exactly one activation in a given layer.

Say layer i has filter size ki and stride si, and the input is x*x. Then (((x-k1+1)/s1-k2+1)/s2.../sn)=1, which is easily solved for x.
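
As a sketch of the "easily solved" step: write y_i for the side length after layer i (a symbol introduced here for illustration, not used in the answer), with k_i and s_i standing for the answer's ki and si, and assume every division comes out exact. Each layer can then be inverted on its own, starting from y_n = 1 and working back to the input:

```latex
y_0 = x, \qquad y_i = \frac{y_{i-1} - k_i + 1}{s_i}, \qquad y_n = 1
\quad\Longrightarrow\quad
y_{i-1} = s_i \, y_i + k_i - 1 \qquad (i = n, n-1, \dots, 1)
```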

The original question is equivalent to asking how large the input image has to be in order to produce exactly one activation in a layer, without considering the stride of the last layer.

So the answer is the value of x obtained with the last stride set to 1 (sn = 1), which can be computed with the following pseudocode:

x = layer[n].k                          # start from the last layer's filter size
for i = n-1 down to 1:
   x = x*layer[i].s + layer[i].k - 1    # undo layer i: its stride, then its filter

The total number of pixels is then x*x.
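
A minimal runnable sketch of that recurrence in Python follows. The function name `covered_side` and the `(k, s)` tuple layout are hypothetical choices made for illustration. The strides are assumptions too: 1 for both conv layers and 3 for the pooling layer, the non-overlapping case.

```python
from typing import List, Tuple


def covered_side(layers: List[Tuple[int, int]]) -> int:
    """Side length x of the input region covered by one filter of the last layer.

    `layers` lists (k, s) = (filter size, stride) from the first layer to the
    last. This follows the answer's recurrence x = x*s + k - 1 and, like the
    answer, ignores the stride of the last layer.
    """
    if not layers:
        raise ValueError("need at least one layer")
    x = layers[-1][0]                    # x = layer[n].k
    for k, s in reversed(layers[:-1]):   # i = n-1 down to 1
        x = x * s + k - 1
    return x


# The network from the question; strides are assumed (conv: 1, pool: 3).
net = [(5, 1), (3, 3), (5, 1)]
sides = [covered_side(net[:n]) for n in range(1, len(net) + 1)]
print(sides)                   # [5, 7, 21]
print([x * x for x in sides])  # [25, 49, 441] pixels per filter
```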

In your example, the side length x for the first layer is 5, for the second layer it is 3*1+5-1=7, and for the third it is 5*3+(3-1)+(5-1)=21 (I'm assuming the pooling layer is non-overlapping, s=3).

You can verify this by going in the other direction: if the input is 21*21, it is 17*17 after the first layer and (17-2)/3=5 after pooling (in fact a 16*16 or 15*15 input to the pooling layer gives the same 5*5 result), which fits exactly into one filter of the third layer.
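
The same check can be sketched as a forward pass in Python. This assumes no padding and the common output-size convention floor((x-k)/s)+1, neither of which the answer states. The helper name `output_side` is hypothetical.

```python
from typing import List, Tuple


def output_side(x: int, layers: List[Tuple[int, int]]) -> int:
    """Side length left after applying each (k, s) layer to an x*x input.

    Assumes no padding and floor((x - k) / s) + 1 outputs per layer.
    Returns 0 as soon as a filter no longer fits.
    """
    for k, s in layers:
        if x < k:
            return 0
        x = (x - k) // s + 1
    return x


# Same assumed network: conv 5x5 (s=1), pool 3x3 (s=3), conv 5x5 (s=1).
net = [(5, 1), (3, 3), (5, 1)]
print(output_side(21, net))  # 1: a 21*21 input yields exactly one activation

# Smallest input that still yields one activation under these assumptions.
smallest = next(x for x in range(1, 100) if output_side(x, net) >= 1)
print(smallest)              # 19
```

Under these assumptions a 19*19 input already produces one activation, which matches the remark that a 15*15 pooling input gives the same result. With floor division, the recurrence x = (x-1)*s + k gives this tighter bound; for strides greater than 1, x*s + k - 1 is slightly larger.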
