How does adaptive pooling in pytorch work?


Problem description

Adaptive pooling is a great function, but how does it work? It seems to insert pads or shrink/expand kernel sizes in what looks like a patterned but fairly arbitrary way. The pytorch documentation I can find is not more descriptive than "put desired output size here." Does anyone know how this works or can point to where it's explained?

Some test code on a 1x1x6 tensor, (1,2,3,4,5,6), with an adaptive output of size 8:

import torch
import torch.nn as nn

class TestNet(nn.Module):
    def __init__(self):
        super(TestNet, self).__init__()
        self.avgpool = nn.AdaptiveAvgPool1d(8)

    def forward(self,x):
        print(x)
        x = self.avgpool(x)
        print(x)
        return x

def test():
    x = torch.Tensor([[[1,2,3,4,5,6]]])
    net = TestNet()
    y = net(x)
    return y

test()

Output:

tensor([[[ 1.,  2.,  3.,  4.,  5.,  6.]]])
tensor([[[ 1.0000,  1.5000,  2.5000,  3.0000,  4.0000,  4.5000,  5.5000,
       6.0000]]])

If it mirror pads by one on the left and right (operating on (1,1,2,3,4,5,6,6)), and has a kernel of 2, then the outputs for all positions except for 4 and 5 make sense, except of course the output isn't the right size. Is it also padding the 3 and 4 internally? If so, it's operating on (1,1,2,3,3,4,4,5,6,6), which, if using a size 2 kernel, produces the wrong output size and would also miss a 3.5 output. Is it changing the size of the kernel?

Am I missing something obvious about the way this works?

Solution

In general, pooling reduces dimensions. If you want to increase dimensions, you might want to look at interpolation.
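
For instance, a minimal sketch (not part of the original answer) of stretching the question's length-6 input to length 8 with F.interpolate instead of pooling:

import torch
import torch.nn.functional as F

x = torch.tensor([[[1., 2., 3., 4., 5., 6.]]])   # shape (1, 1, 6)
# linear interpolation along the last dimension up to the requested length
y = F.interpolate(x, size=8, mode='linear', align_corners=False)
print(y.shape)   # torch.Size([1, 1, 8])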

Anyway, let's talk about adaptive pooling in general. You can look at the source code here. Some claimed that adaptive pooling is the same as standard pooling with stride and kernel size calculated from input and output size. Specifically, the following parameters are used:

  1. Stride = (input_size//output_size)
  2. Kernel size = input_size - (output_size-1)*stride
  3. Padding = 0

These are worked backwards from the pooling formula. While they DO produce output of the desired size, their output is not necessarily the same as that of adaptive pooling. Here is a test snippet:

import torch
import torch.nn as nn

in_length = 5
out_length = 3

x = torch.arange(0, in_length).view(1, 1, -1).float()
print(x)

# plain AvgPool1d with stride and kernel size derived from the formulas above
stride = (in_length // out_length)
avg_pool = nn.AvgPool1d(
        stride=stride,
        kernel_size=(in_length - (out_length - 1) * stride),
        padding=0,
    )
adaptive_pool = nn.AdaptiveAvgPool1d(out_length)

print(avg_pool.stride, avg_pool.kernel_size)

y_avg = avg_pool(x)
y_ada = adaptive_pool(x)

print(y_avg)
print(y_ada)

# sum of absolute differences between the two results (the "Error" line in the output below)
print('Error: ', (y_avg - y_ada).abs().sum().item())

Output:

tensor([[[0., 1., 2., 3., 4.]]])
(1,) (3,)
tensor([[[1., 2., 3.]]])
tensor([[[0.5000, 2.0000, 3.5000]]])
Error:  1.0

Average pooling pools from elements (0, 1, 2), (1, 2, 3) and (2, 3, 4).

Adaptive pooling pools from elements (0, 1), (1, 2, 3) and (3, 4). (Change the code a bit to see that it is not pooling from (2) only)

  • You can tell that adaptive pooling tries to reduce the overlap between pooling windows.
  • The difference can be mitigated by using padding with count_include_pad=True, but in general I don't think the two can be made exactly the same for 2D or higher for all input/output sizes. I would imagine that would require different paddings on the left/right, which pooling layers do not support at the moment.
  • From a practical perspective it should not matter much.
  • Check the code for the actual implementation.
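
To make the last point concrete, here is a small sketch (mine, not the answer's original code) that reproduces AdaptiveAvgPool1d by hand. It assumes each output element i averages the input slice [floor(i * L / out), ceil((i + 1) * L / out)), which matches both examples above; treat it as an illustration of the idea rather than a quote of PyTorch's source.

import math
import torch
import torch.nn as nn

def manual_adaptive_avg_pool1d(x, output_size):
    # Average each window [floor(i*L/out), ceil((i+1)*L/out)) of the last dimension.
    length = x.shape[-1]
    pieces = []
    for i in range(output_size):
        start = math.floor(i * length / output_size)
        end = math.ceil((i + 1) * length / output_size)
        pieces.append(x[..., start:end].mean(dim=-1, keepdim=True))
    return torch.cat(pieces, dim=-1)

x = torch.tensor([[[1., 2., 3., 4., 5., 6.]]])
print(manual_adaptive_avg_pool1d(x, 8))   # matches the question's output
print(nn.AdaptiveAvgPool1d(8)(x))         # reference result from PyTorch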
