Pooling vs Pooling-over-time


Question


I understand conceptually what happens in a max/sum pool as a CNN layer operation, but I see the terms "max pool over time" and "sum pool over time" thrown around (e.g., in the "Convolutional Neural Networks for Sentence Classification" paper by Yoon Kim). What is the difference?

Answer


Max-over-time pooling is usually applied in NLP (unlike ordinary max-pooling, which is common in CNNs for computer vision tasks), so the setup is a little bit different.


The input to the max-over-time pooling is a feature map c = [c(1), ..., c(n-h+1)], which is computed over a sentence of length n with a filter of size h. The convolution operation is very similar to the one used on images, but here it is applied to a 1-dimensional sequence of word vectors. This is formula (3) in the paper.
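As a minimal sketch of how such a feature map arises (the dimensions and ReLU nonlinearity here are illustrative assumptions, not taken from the paper's exact hyperparameters), a single filter of size h slid over a sentence of n embedded words yields n - h + 1 values:

```python
import numpy as np

# Hypothetical setup: a sentence of n = 7 words, each embedded in d = 4 dims,
# and one convolutional filter spanning h = 3 consecutive words.
n, d, h = 7, 4, 3
rng = np.random.default_rng(0)
X = rng.standard_normal((n, d))   # word embeddings for the sentence
W = rng.standard_normal((h, d))   # one filter
b = 0.1

# Slide the filter over every window of h words:
# c(i) = f(W · x[i:i+h] + b), here with f = ReLU.
c = np.array([np.maximum(W.ravel() @ X[i:i + h].ravel() + b, 0.0)
              for i in range(n - h + 1)])

print(c.shape)  # (5,), i.e. n - h + 1 entries
```

A longer sentence would produce a longer feature map from the very same filter, which is exactly the variable-length problem that pooling over time then resolves.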


The max-over-time pooling operation is very simple: max_c = max(c), i.e., it is a single number, the maximum over the whole feature map. The reason to do this, instead of "down-sampling" the sentence as in an image CNN, is that in NLP sentences naturally have different lengths across a corpus. This makes the feature maps differ in size between sentences, but we would like to reduce the tensor to a fixed size in order to apply a softmax or regression head at the end. As stated in the paper, it captures the most important feature, the one with the highest value, from each feature map.
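The fixed-size property can be seen directly: feature maps of different lengths (the example values below are made up) all collapse to one number per filter, so k filters always give a k-dimensional vector.

```python
import numpy as np

# Feature maps from two sentences of different lengths, same filter size h = 3.
c_short = np.array([0.2, 1.7, 0.5])            # from a 5-word sentence
c_long  = np.array([0.1, 0.4, 2.3, 0.0, 0.9])  # from a 7-word sentence

# Max-over-time pooling: each feature map becomes a single number,
# regardless of how long the sentence (and hence the map) was.
feature_maps = [c_short, c_long]
pooled = np.array([fm.max() for fm in feature_maps])

print(pooled)  # [1.7 2.3]
```

With k filters per sentence, the same reduction yields a fixed k-dimensional vector for every sentence, ready for the softmax head.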


Note that in computer vision, images are usually¹ of the same size, such as 28x28 or 32x32, which is why it is unnecessary to downsample the feature maps to 1x1 immediately.

The same holds for sum pooling over time.
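Concretely, sum-over-time pooling just swaps the reduction (the feature-map values below are illustrative):

```python
import numpy as np

# Sum-over-time pooling: still one number per feature map,
# but summing over the time axis instead of taking the max.
c = np.array([0.1, 0.4, 2.3, 0.0, 0.9])
pooled_sum = c.sum()

print(pooled_sum)  # ≈ 3.7
```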


¹ Modern CNNs can be trained on images of different sizes, but this requires the network to be all-convolutional, so it doesn't have any pooling layers. See this question for more details.

