pytorch backprop through volatile variable error


Problem description


I'm trying to minimize some input relative to some target by running it through several backward pass iterations and updating the input at each step. The first pass runs successfully but I get the following error on the second pass: RuntimeError: element 0 of variables tuple is volatile

This code snippet demonstrates the problem:

import torch
from torch.autograd import Variable
import torch.nn as nn

inp = Variable(torch.Tensor([1]), requires_grad=True)
target = Variable(torch.Tensor([3]))

loss_fn = nn.MSELoss()

for i in range(2):
    loss = loss_fn(inp, target)
    loss.backward()                  # fails on the second iteration
    gradient = inp.grad
    inp = inp - inp.grad * 0.01      # reassignment creates a new Variable


When I inspect the value of inp, before it is reassigned on the last line, inp.volatile => False and inp.requires_grad => True but after it is reassigned those switch to True and False, respectively. Why does being a volatile variable prevent the second backprop run?
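The flag flip described above can be checked directly. A minimal sketch against the legacy (pre-0.4) Variable API used in the snippet, inspecting the flags around the update:

loss = loss_fn(inp, target)
loss.backward()
print(inp.volatile, inp.requires_grad)   # False True  -- original leaf Variable
inp = inp - inp.grad * 0.01              # a new, non-leaf Variable is created here
print(inp.volatile, inp.requires_grad)   # True False  -- flags after the reassignment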

Recommended answer


You must zero out the gradient before each update like this:

inp.grad.data.zero_()


But in your code, every time you update the input you create another Variable object, so you would have to zero out the gradients of the entire history, like this:

import torch
from torch.autograd import Variable
import torch.nn as nn

inp_hist = []
inp = Variable(torch.Tensor([1]), requires_grad=True)
target = Variable(torch.Tensor([3]))

loss_fn = nn.MSELoss()

for i in range(2):
    loss = loss_fn(inp, target)
    loss.backward()
    gradient = inp.grad
    inp_hist.append(inp)              # keep every Variable ever used as the input
    inp = inp - inp.grad * 0.01
    for past_inp in inp_hist:         # zero the gradient of each past input
        past_inp.grad.data.zero_()


But this way you will compute gradients for all the previous inputs you have created in the history (which is bad, it's a waste of everything), so a correct implementation looks like this:

import torch
from torch.autograd import Variable
import torch.nn as nn
inp = Variable(torch.Tensor([1]), requires_grad=True)
target = Variable(torch.Tensor([3]))
loss_fn = nn.MSELoss()
for i in range(2):
    loss = loss_fn(inp, target)
    loss.backward()
    gradient = inp.grad
    inp.data = inp.data - inp.grad.data * 0.01   # update the data in place; no new Variable is created
    inp.grad.data.zero_()                        # clear the accumulated gradient before the next pass
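
For readers on newer PyTorch (0.4 and later), where Variable has been merged into Tensor and the volatile flag has been replaced by the torch.no_grad() context manager, the same manual update can be sketched like this (an equivalent sketch, not the answerer's original code):

import torch
import torch.nn as nn

# Sketch for PyTorch >= 0.4: Variable is merged into Tensor and the
# volatile flag is replaced by the torch.no_grad() context manager.
inp = torch.tensor([1.0], requires_grad=True)
target = torch.tensor([3.0])
loss_fn = nn.MSELoss()

for i in range(2):
    loss = loss_fn(inp, target)
    loss.backward()
    with torch.no_grad():        # perform the update without recording it in the graph
        inp -= inp.grad * 0.01
    inp.grad.zero_()             # clear the accumulated gradient before the next pass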

