Fully Convolutional Network Receptive Field


Question

There are many questions regarding the calculation of the receptive field. It is explained very well here on StackOverflow.

However, there are no blogs or tutorials on how to calculate it in a fully convolutional network, i.e. one with residual blocks, feature map concatenation and upsampling layers (like a feature pyramid network).

  1. As far as I understand, residual blocks and skip connections do not contribute to the receptive field and can therefore be skipped (answer from here).

  2. How are upsampling layers handled? For example, if we have an effective receptive field of 900 and an upsampling layer follows, does the receptive field get halved?

  3. Does the receptive field change when concatenated with feature maps from prior layers?

Thanks in advance!

Recommended answer

To answer your question piece by piece, let us first start with the definition of the receptive field in this context:

The receptive field of an individual sensory neuron is the particular region of the sensory space (e.g., the body surface, or the visual field) in which a stimulus will modify the firing of that neuron.

This definition is taken from Wikipedia. It means we are looking for all pixels in your input that affect the current output. Logically, if you perform a convolution, say with a single 3x3 filter kernel, the receptive field of a single output pixel is the corresponding 3x3 region of the input that gets convolved in that specific step.
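The single-kernel case above extends to a chain of convolutions with a simple recurrence. The following is a minimal sketch, not something taken from the original answer: it assumes plain convolutions without dilation, and the names `receptive_field`, `size` and `jump` are purely illustrative.

```python
def receptive_field(layers):
    """Return (size, jump) for a chain of convolution layers.

    layers: iterable of (kernel_size, stride) pairs, ordered input to output.
    size:   receptive field width in input pixels (along one axis).
    jump:   distance in input pixels between neighbouring output positions.
    Assumption: no dilation; padding only shifts the field, not its size.
    """
    size, jump = 1, 1  # an input pixel "sees" only itself
    for kernel_size, stride in layers:
        # Each extra kernel tap reaches one more output step (= jump pixels).
        size += (kernel_size - 1) * jump
        jump *= stride
    return size, jump


single_conv = receptive_field([(3, 1)])        # one 3x3 convolution
two_convs = receptive_field([(3, 1), (3, 1)])  # two stacked 3x3 convolutions
strided = receptive_field([(3, 2), (3, 1)])    # first convolution has stride 2
print(single_conv, two_convs, strided)
```

A single 3x3 convolution gives a size of 3, matching the description above; stacking layers or adding stride makes the field grow.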

Visually, in this graphic the underlying darker area marks the receptive field for specific pixels in the output:

[Figure not preserved: an animated convolution, with the input shown in blue and the output in green.]

Now, to answer your first question: residual blocks of course still contribute to the receptive field! Let us denote the residual block as follows:

  • F(X): residual block
  • g_i(X): single convolutional block

Then we can write the residual block as F(X) = g_3(g_2(g_1(X))) + X, so in this case we stack 3 convolutions (as an example). Every single one of these convolution layers still alters the receptive field, exactly as explained at the beginning. Simply adding X again does not change the receptive field, of course, but that addition alone does not make a residual block.
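To make the F(X) = g_3(g_2(g_1(X))) + X case concrete, here is a small sketch. It assumes, purely for illustration, that each g_i is a 3x3 convolution with stride 1 and no dilation; the helper names `chain_rf` and `residual_block_rf` are hypothetical.

```python
def chain_rf(kernel_sizes):
    """Receptive field width of stride-1 convolutions applied in sequence."""
    size = 1
    for k in kernel_sizes:
        size += k - 1
    return size


def residual_block_rf(kernel_sizes):
    """Receptive field of F(X) = g_n(...g_1(X)) + X, measured relative to X.

    Both branches are centred on the same pixel, so the union of their
    receptive fields is simply the larger of the two.
    """
    conv_path = chain_rf(kernel_sizes)  # g_3(g_2(g_1(X)))
    identity_path = chain_rf([])        # the "+ X" branch sees one pixel
    return max(conv_path, identity_path)


block_rf = residual_block_rf([3, 3, 3])
print(block_rf)  # 7
```

The identity branch contributes a field of 1, so the block's receptive field is set entirely by the convolution path.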

Similarly, skip connections usually do not change the receptive field, because a path that skips layers almost always has a different (mostly smaller) receptive field than the path that goes through them. As explained in the answer you linked, though, it does make a difference if the skip connection has a larger receptive field, since the overall receptive field is the maximum (more precisely, the union) of the regions covered by the different paths through your flow graph.

For the question about upsampling layers, you can guess the answer yourself by asking the following question: does upsampling anywhere inside the network change which area of the input image an output depends on?

The answer should be "obviously not". Essentially, you are still looking at the same area of the input, although you now have a higher resolution, and adjacent output pixels might in fact look at exactly the same area. To get back to the GIF above: if you had 4x the number of pixels in the green area, every pixel would still have to look at a particular input region in the blue area, and that region does not change in size. So no, upsampling does not affect this.
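The same bookkeeping can be sketched for the upsampling case from the question. This is a rough model under stated assumptions: nearest-neighbour upsampling (an interpolating upsampler may enlarge the field slightly), and a jump of 32 input pixels before the upsampling layer, which is a made-up value since the question only gives the size of 900. The function names are hypothetical.

```python
def conv(size, jump, kernel_size, stride=1):
    """Update (size, jump) for an undilated convolution."""
    return size + (kernel_size - 1) * jump, jump * stride


def upsample(size, jump, factor):
    """Nearest-neighbour upsampling: each new pixel copies an old one.

    The region of the input it depends on is unchanged; only the spacing
    between neighbouring output positions shrinks.
    """
    return size, jump / factor


before = (900, 32)                           # jump of 32 is an assumed value
after_up = upsample(*before, factor=2)       # size stays 900
after_conv = conv(*after_up, kernel_size=3)  # grows by 2 * 16 input pixels
print(before, after_up, after_conv)
```

In this model the receptive field is not halved by upsampling. What changes is how quickly later convolutions enlarge it, because each step on the upsampled grid covers fewer input pixels.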

For the last question: this is closely related to the first one. The receptive field covers all the input pixels that affect the output, so depending on which feature maps you are concatenating, the concatenation might change it.

Again, the resulting receptive field is the union of the receptive fields of the feature maps you are concatenating. If one is contained in the other (either A subset of B or B subset of A, where A and B are the receptive fields of the feature maps to be concatenated), then the receptive field does not change. Otherwise, the receptive field is A union B.
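The union rule can be illustrated along one image axis by treating each receptive field as an inclusive pixel interval. This is only a sketch: the pixel coordinates are invented, `concat_rf` and `width` are hypothetical helpers, and the bounding interval equals the true union only when the two intervals overlap, which is the usual case for feature maps aligned at the same position.

```python
def concat_rf(a, b):
    """Receptive field of concat(A, B) at one output position.

    a, b: (first_pixel, last_pixel) inclusive intervals on one input axis.
    Returns the smallest interval covering both.
    """
    return min(a[0], b[0]), max(a[1], b[1])


def width(interval):
    """Number of pixels in an inclusive interval."""
    return interval[1] - interval[0] + 1


shallow = (48, 52)  # made-up field of an early feature map, 5 pixels wide
deep = (35, 65)     # made-up field of a deeper feature map, 31 pixels wide

nested = concat_rf(shallow, deep)         # shallow lies inside deep
shifted = concat_rf((30, 50), (40, 70))   # neither contains the other
print(nested, width(nested))
print(shifted, width(shifted))
```

When one field is nested in the other, the result is just the larger field; when they only partly overlap, the result is larger than either input.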
