PyTorch - parameters not changing


Problem Description

In an effort to learn how pytorch works, I am trying to do maximum likelihood estimation of some of the parameters in a multivariate normal distribution. However it does not seem to work for any of the covariance related parameters.

So my question is: why does this code not work?

import torch


def make_covariance_matrix(sigma, rho):
    return torch.tensor([[sigma[0]**2, rho * torch.prod(sigma)],
                         [rho * torch.prod(sigma), sigma[1]**2]])


mu_true = torch.randn(2)
rho_true = torch.rand(1)
sigma_true = torch.exp(torch.rand(2))

cov_true = make_covariance_matrix(sigma_true, rho_true)
dist_true = torch.distributions.MultivariateNormal(mu_true, cov_true)

samples = dist_true.sample((1_000,))

mu = torch.zeros(2, requires_grad=True)
log_sigma = torch.zeros(2, requires_grad=True)
atanh_rho = torch.zeros(1, requires_grad=True)

lbfgs = torch.optim.LBFGS([mu, log_sigma, atanh_rho])


def closure():
    lbfgs.zero_grad()
    sigma = torch.exp(log_sigma)
    rho = torch.tanh(atanh_rho)
    cov = make_covariance_matrix(sigma, rho)
    dist = torch.distributions.MultivariateNormal(mu, cov)
    loss = -torch.mean(dist.log_prob(samples))
    loss.backward()
    return loss


lbfgs.step(closure)

print("mu: {}, mu_hat: {}".format(mu_true, mu))
print("sigma: {}, sigma_hat: {}".format(sigma_true, torch.exp(log_sigma)))
print("rho: {}, rho_hat: {}".format(rho_true, torch.tanh(atanh_rho)))

Output:

mu: tensor([0.4168, 0.1580]), mu_hat: tensor([0.4127, 0.1454], requires_grad=True)
sigma: tensor([1.1917, 1.7290]), sigma_hat: tensor([1., 1.], grad_fn=<ExpBackward>)
rho: tensor([0.3589]), rho_hat: tensor([0.], grad_fn=<TanhBackward>)

>>> torch.__version__
'1.0.0.dev20181127'

In other words, why have the estimates of log_sigma and atanh_rho not moved from their initial values?

Answer

The way you create your covariance matrix is not backprop-able:

def make_covariance_matrix(sigma, rho):
    return torch.tensor([[sigma[0]**2, rho * torch.prod(sigma)],
                         [rho * torch.prod(sigma), sigma[1]**2]])

When creating a new tensor from (multiple) tensors, only the values of the input tensors are kept. All additional information is stripped away, so every graph connection to your parameters is cut at that point and backpropagation cannot get through.

Here is a short example to illustrate this:

import torch

param1 = torch.rand(1, requires_grad=True)
param2 = torch.rand(1, requires_grad=True)
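# NOTE: torch.tensor() copies only the values of param1 and param2;
# the autograd connection to the original parameters is dropped.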
tensor_from_params = torch.tensor([param1, param2])

print('Original parameter 1:')
print(param1, param1.requires_grad)
print('Original parameter 2:')
print(param2, param2.requires_grad)
print('New tensor from params:')
print(tensor_from_params, tensor_from_params.requires_grad)

Output:

Original parameter 1:
tensor([ 0.8913]) True
Original parameter 2:
tensor([ 0.4785]) True
New tensor from params:
tensor([ 0.8913,  0.4785]) False

As you can see, the tensor created from the parameters param1 and param2 does not keep track of their gradients.
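
For contrast, here is a minimal sketch (my addition, not part of the original answer), reusing param1 and param2 from the example above: torch.cat is an autograd-aware operation, so its result stays connected to the parameters, which is exactly what the fix below relies on.

# torch.cat operates on the tensors inside the autograd graph instead of
# copying their raw values, so the result keeps requires_grad and a grad_fn.
tensor_via_cat = torch.cat([param1, param2])
print(tensor_via_cat.requires_grad)  # True
print(tensor_via_cat.grad_fn)        # e.g. <CatBackward0 object at 0x...>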


So you can instead use the following code, which keeps the graph connection and is backprop-able:

def make_covariance_matrix(sigma, rho):
    cov = torch.cat([(sigma[0]**2).view(-1),
                     rho * torch.prod(sigma),
                     rho * torch.prod(sigma),
                     (sigma[1]**2).view(-1)])
    return cov.view(2, 2)

The values are concatenated into a flat tensor using torch.cat and then brought into the right shape using view(). This produces the same matrix as your original function, but it keeps the connection to the parameters log_sigma and atanh_rho.
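
An equivalent variation (my own sketch, not from the original answer) builds the matrix from its four scalar entries with torch.stack, which preserves the graph in the same way:

def make_covariance_matrix(sigma, rho):
    # Every entry is a 0-dim tensor; stacking keeps the autograd graph intact.
    off_diag = rho.squeeze() * torch.prod(sigma)
    return torch.stack([sigma[0]**2, off_diag,
                        off_diag, sigma[1]**2]).view(2, 2)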

Here is the output before and after the step with the changed make_covariance_matrix. As you can see, the parameters now get optimized and the values change:

Before:
mu: tensor([ 0.1191,  0.7215]), mu_hat: tensor([ 0.,  0.])
sigma: tensor([ 1.4222,  1.0949]), sigma_hat: tensor([ 1.,  1.])
rho: tensor([ 0.2558]), rho_hat: tensor([ 0.])

After:
mu: tensor([ 0.1191,  0.7215]), mu_hat: tensor([ 0.0712,  0.7781])
sigma: tensor([ 1.4222,  1.0949]), sigma_hat: tensor([ 1.4410,  1.0807])
rho: tensor([ 0.2558]), rho_hat: tensor([ 0.2235])
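
A side note (my observation, not from the original answer): torch.optim.LBFGS performs up to max_iter inner iterations per call (20 by default), which is why a single lbfgs.step(closure) already gets this close. Calling step in a loop can tighten the estimates further:

# Each .step(closure) re-evaluates the closure up to max_iter times
# (max_iter=20 by default), so a few outer iterations refine the fit.
for _ in range(5):
    lbfgs.step(closure)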

Hope this helps!
