Implement dropout layer using nn.Sequential()

Problem description

I am trying to implement a Dropout layer using pytorch as follows:

class DropoutLayer(nn.Module):
    def __init__(self, p):
        super().__init__()
        self.p = p

    def forward(self, input):
        if self.training:
            u1 = (np.random.rand(*input.shape)<self.p) / self.p
            u1 *= u1
            return u1
        else:
            input *= self.p

And then calling a simple NN.sequential:

model = nn.Sequential(nn.Linear(input_size,num_classes), DropoutLayer(.7), nn.Flatten())

opt = torch.optim.Adam(model.parameters(), lr=0.005)
train(model, opt, 5)  # train(model, optimizer, epochs)

But I get the following error:

TypeError: flatten() takes at most 1 argument (2 given)

Not sure what I'm doing wrong. Still new to pytorch. Thanks.

Solution

In the forward function of your DropoutLayer, when you enter the else branch, there is no return statement. Therefore the following layer (Flatten) will receive no input. However, as emphasized in the comments, that's not the actual problem.

The actual problem is that you are passing a numpy array to your Flatten layer. Minimal code to reproduce the problem:

nn.Flatten()(np.random.randn(5,5))
>>> TypeError: flatten() takes at most 1 argument (2 given)

However, I cannot explain why this layer behaves like that on a numpy array; the behavior of the flatten function is much more understandable. I don't know what additional operations the layer performs.

torch.flatten(np.random.randn(5,5))
>>> TypeError: flatten(): argument 'input' (position 1) must be Tensor, not numpy.ndarray
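As a quick check (not part of the original answer), converting the numpy array to a tensor before handing it to the layer avoids the error entirely:

```python
import numpy as np
import torch
import torch.nn as nn

arr = np.random.randn(5, 2, 3)

# nn.Flatten keeps dim 0 as the batch dimension and flattens the rest,
# so a (5, 2, 3) input becomes (5, 6)
out = nn.Flatten()(torch.from_numpy(arr))
print(out.shape)  # torch.Size([5, 6])
```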

The reason your code raises this error is that in the forward pass, you create a numpy array, perform some operations on it, and return it instead of returning a tensor. In fact, you never even touch the actual input tensor (in the first branch).
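Putting the two fixes together (stay in torch ops so a tensor is returned, and return something from both branches), a corrected layer might look like the sketch below. It assumes, as the question's code suggests, that p is the keep probability with inverted-dropout scaling, in which case the eval branch can simply return the input unchanged; this is a reading of the intent, not the only possible fix.

```python
import torch
import torch.nn as nn

class DropoutLayer(nn.Module):
    """Inverted dropout; p is the keep probability (as in the question)."""
    def __init__(self, p):
        super().__init__()
        self.p = p

    def forward(self, input):
        if self.training:
            # Build the mask with torch ops so the result stays a tensor,
            # and actually multiply it into the input
            mask = (torch.rand_like(input) < self.p).float() / self.p
            return input * mask
        # With inverted dropout the expected scale is already correct,
        # so evaluation is the identity
        return input
```

With this version, nn.Sequential(nn.Linear(...), DropoutLayer(.7), nn.Flatten()) passes a proper tensor on to Flatten in both training and eval mode.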
