Pytorch: define custom function

Problem description

I wanted to write my own activation function, but I ran into a problem: the matrix multiplication ends up calling .data. I searched but found little useful information. Any help will be appreciated. The error message is

    Traceback (most recent call last):
      File "defineAutogradFuncion.py", line 126, in <module>
        test = gradcheck(argmin, input, eps=1e-6, atol=1e-4)
      File "/home/zhaosl/.local/lib/python2.7/site-packages/torch/autograd/gradcheck.py", line 154, in gradcheck
        output = func(*inputs)
      File "defineAutogradFuncion.py", line 86, in forward
        output = output.mm(dismap).squeeze(-1)
      File "/home/zhaosl/.local/lib/python2.7/site-packages/torch/autograd/variable.py", line 578, in mm
        output = Variable(self.data.new(self.data.size(0), matrix.data.size(1)))
      File "/home/zhaosl/.local/lib/python2.7/site-packages/torch/tensor.py", line 374, in data
        raise RuntimeError('cannot call .data on a torch.Tensor: did you intend to use autograd.Variable?')
    RuntimeError: cannot call .data on a torch.Tensor: did you intend to use autograd.Variable?

    import torch
    import torch.nn.functional as Fun   # assumed: `Fun` refers to torch.nn.functional

    class Softargmin(torch.autograd.Function):
        """
        We can implement our own custom autograd Functions by subclassing
        torch.autograd.Function and implementing the forward and backward passes
        which operate on Tensors.
        """
        @staticmethod
        def forward(self, input):
            """
            In the forward pass we receive a Tensor containing the input and return a
            Tensor containing the output. You can cache arbitrary Tensors for use in the
            backward pass using the save_for_backward method.
            """
            #P = Fun.softmax(-input)
            inputSqueeze = input.squeeze(-1)
            P = Fun.softmax(-inputSqueeze)
            self.save_for_backward(P)

            output = P.permute(0, 2, 3, 1)
            dismap = torch.arange(0, output.size(-1) + 1).unsqueeze(1)
            output = output.mm(dismap).squeeze(-1)
            return output

        @staticmethod
        def backward(self, grad_output):
            """
            In the backward pass we receive a Tensor containing the gradient of the loss
            with respect to the output, and we need to compute the gradient of the loss
            with respect to the input.
            """
            P, = self.saved_tensors
            P = P.unsqueeze(-1)
            Pk = torch.squeeze(P, -1).permute(0, 2, 3, 1)
            k = torch.arange(0, Pk.size(-1) + 1).unsqueeze(1)
            sumkPk = Pk.mm(k)
            sumkPk = sumkPk.unsqueeze(1).expand(P.size())
            i = torch.arange(0, Pk.size(-1) + 1).view(1, -1, 1, 1, 1).expand(P.size())
            grad_output_expand = grad_output.unsqueeze(-1).unsqueeze(1).expand(P.size())
            grad_input = grad_output_expand * P * (sumkPk - i)
            return grad_input

Recommended answer

The most basic element in PyTorch is a Tensor, which is the equivalent of numpy.ndarray with the only difference being that a Tensor can be put onto a GPU for any computation.
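
For reference, a minimal sketch of that equivalence (the GPU step assumes a CUDA-capable build; the values are just illustrative):

    import torch

    # A Tensor holds the data itself, much like a numpy.ndarray.
    t = torch.Tensor([[1.0, 2.0], [3.0, 4.0]])
    print(t.numpy())              # the same data viewed as a numpy array

    # The practical difference: a Tensor can be moved onto the GPU.
    if torch.cuda.is_available():
        t = t.cuda()              # subsequent computations on t run on the GPU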

A Variable is a wrapper around Tensor that contains three attributes: data, grad and grad_fn. data contains the original Tensor; grad contains the derivative/gradient of some value with respect to this Variable; and grad_fn is a pointer to the Function object that created this Variable. The grad_fn attribute is actually the key for autograd to work properly since PyTorch uses those pointers to build the computation graph at each iteration and carry out the differentiations for all Variables in your graph accordingly. This is not only about differentiating correctly through this custom Function object you are creating.
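
As a quick illustration of those three attributes (a minimal sketch using the Variable API from the PyTorch version in the traceback; the values are arbitrary):

    import torch
    from torch.autograd import Variable

    x = Variable(torch.ones(2, 2), requires_grad=True)
    y = x * 3                 # y is produced by an operation, so it records its creator

    print(x.data)             # the raw Tensor wrapped by the Variable
    print(x.grad)             # None until backward() has been called
    print(x.grad_fn)          # None: x is a leaf, it was not created by a Function
    print(y.grad_fn)          # points to the Function that created y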

Hence whenever you create some Tensor in your computation that requires differentiation, wrap it as a Variable. First, this enables the Tensor to save the resulting derivative/gradient value after you call backward(). Second, it helps autograd build a correct computation graph.
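
A minimal sketch of that pattern (the names x, w and loss are illustrative, not taken from your code; this uses the same 0.x Variable API):

    import torch
    from torch.autograd import Variable

    x = Variable(torch.Tensor([2.0, 3.0]), requires_grad=True)

    # A constant Tensor created mid-computation is wrapped as a Variable too,
    # so autograd can place it in the graph alongside x.
    w = Variable(torch.Tensor([10.0, 20.0]))

    loss = (x * w).sum()
    loss.backward()           # fills in x.grad
    print(x.grad)             # d(loss)/dx, i.e. the values of w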

Another thing to notice is that whenever you send a Variable into your computation graph, any value that is computed using this Variable will automatically be a Variable. So you don't have to manually wrap all Tensors in your computation graph.
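
For example (again a minimal sketch under the 0.x Variable API; the numbers are arbitrary):

    import torch
    from torch.autograd import Variable

    x = Variable(torch.randn(3), requires_grad=True)
    y = x * 2 + 1             # built only from Variables, no manual wrapping needed

    print(type(y))            # a Variable, produced automatically
    print(y.requires_grad)    # True: the flag propagates from x
    print(y.grad_fn)          # set, since y was created by autograd operations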

You might want to take a look at this.

Going back to your error: it is a little difficult to figure out what is really causing the trouble, because you are not showing all of your code (for example, how you are using this custom Function in your computation graph). But the most likely scenario is that you used this Function in a subgraph that had to be differentiated through. When PyTorch ran its numerical gradient check on your model to see whether the differentiation is correct, it assumed that every node in that subgraph was a Variable, because that is necessary for differentiation through that subgraph to happen. It then tried to call the data attribute of one of those supposed Variables, most likely because that value is used somewhere in the differentiation, and it failed because that node was in fact a Tensor and did not have a data attribute.
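
If the failing mm call in your forward is the one from the traceback, a hedged guess (based only on the snippet shown) is that output has become a Variable while dismap, built with torch.arange, is still a plain Tensor, and mixing the two is what triggers the .data access on a Tensor. Wrapping the freshly created Tensor, as described above, keeps everything inside autograd. A sketch of that idea with made-up shapes:

    import torch
    from torch.autograd import Variable

    # Stand-in for the `output` Variable just before the failing line.
    output = Variable(torch.randn(4, 9))

    # The original code builds `dismap` as a plain Tensor; wrapping it as a
    # Variable keeps the mm() call entirely inside autograd.
    dismap = Variable(torch.arange(0, 9).unsqueeze(1).float())

    result = output.mm(dismap)    # a 4x1 Variable that autograd can track
    print(result.size())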
