PyTorch - parameters not changing


Question

In an effort to learn how PyTorch works, I am trying to do maximum likelihood estimation of some of the parameters in a multivariate normal distribution. However, it does not seem to work for any of the covariance-related parameters.

So my question is: why does this code not work?

import torch


def make_covariance_matrix(sigma, rho):
    return torch.tensor([[sigma[0]**2, rho * torch.prod(sigma)],
                         [rho * torch.prod(sigma), sigma[1]**2]])


mu_true = torch.randn(2)
rho_true = torch.rand(1)
sigma_true = torch.exp(torch.rand(2))

cov_true = make_covariance_matrix(sigma_true, rho_true)
dist_true = torch.distributions.MultivariateNormal(mu_true, cov_true)

samples = dist_true.sample((1_000,))

mu = torch.zeros(2, requires_grad=True)
log_sigma = torch.zeros(2, requires_grad=True)   # sigma is optimized on the log scale
atanh_rho = torch.zeros(1, requires_grad=True)   # rho is optimized through atanh

lbfgs = torch.optim.LBFGS([mu, log_sigma, atanh_rho])


def closure():
    lbfgs.zero_grad()
    sigma = torch.exp(log_sigma)  # map back to sigma > 0
    rho = torch.tanh(atanh_rho)   # map back to rho in (-1, 1)
    cov = make_covariance_matrix(sigma, rho)
    dist = torch.distributions.MultivariateNormal(mu, cov)
    loss = -torch.mean(dist.log_prob(samples))  # negative mean log-likelihood
    loss.backward()
    return loss


lbfgs.step(closure)

print("mu: {}, mu_hat: {}".format(mu_true, mu))
print("sigma: {}, sigma_hat: {}".format(sigma_true, torch.exp(log_sigma)))
print("rho: {}, rho_hat: {}".format(rho_true, torch.tanh(atanh_rho)))

Output:

mu: tensor([0.4168, 0.1580]), mu_hat: tensor([0.4127, 0.1454], requires_grad=True)
sigma: tensor([1.1917, 1.7290]), sigma_hat: tensor([1., 1.], grad_fn=<ExpBackward>)
rho: tensor([0.3589]), rho_hat: tensor([0.], grad_fn=<TanhBackward>)

>>> torch.__version__
'1.0.0.dev20181127'

In other words, why have the estimates of log_sigma and atanh_rho not moved from their initial values?
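
To make the symptom concrete, here is a minimal diagnostic sketch: running the closure's computation once by hand and inspecting the .grad attributes shows that gradients reach mu but never reach the covariance parameters.

sigma = torch.exp(log_sigma)
rho = torch.tanh(atanh_rho)
cov = make_covariance_matrix(sigma, rho)
dist = torch.distributions.MultivariateNormal(mu, cov)
loss = -torch.mean(dist.log_prob(samples))
loss.backward()

print(mu.grad)         # a filled-in tensor: gradients reach mu
print(log_sigma.grad)  # None: gradients never reach log_sigma
print(atanh_rho.grad)  # None: gradients never reach atanh_rho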

Answer

The way you create your covariance matrix is not backprop-able:

def make_covariance_matrix(sigma, rho):
    return torch.tensor([[sigma[0]**2, rho * torch.prod(sigma)],
                         [rho * torch.prod(sigma), sigma[1]**2]])

When creating a new tensor from (multiple) tensors, only the values of the input tensors are kept. All additional information from the input tensors is stripped away, so the graph connection to your parameters is cut at this point and backpropagation cannot get through.

Here is a short example to illustrate this:

import torch

param1 = torch.rand(1, requires_grad=True)
param2 = torch.rand(1, requires_grad=True)
tensor_from_params = torch.tensor([param1, param2])

print('Original parameter 1:')
print(param1, param1.requires_grad)
print('Original parameter 2:')
print(param2, param2.requires_grad)
print('New tensor from params:')
print(tensor_from_params, tensor_from_params.requires_grad)

Output:

Original parameter 1:
tensor([ 0.8913]) True
Original parameter 2:
tensor([ 0.4785]) True
New tensor from params:
tensor([ 0.8913,  0.4785]) False

As you can see, the tensor created from the parameters param1 and param2 does not keep track of the gradients of param1 and param2.


So instead, you can use the following code, which keeps the graph connection and is backprop-able:

def make_covariance_matrix(sigma, rho):
    # torch.cat keeps the inputs in the autograd graph, unlike torch.tensor
    cov = torch.cat([(sigma[0]**2).view(-1),
                     rho * torch.prod(sigma),
                     rho * torch.prod(sigma),
                     (sigma[1]**2).view(-1)])
    return cov.view(2, 2)

The values are concatenated into a flat tensor using torch.cat and then brought into the right shape using view(). This produces the same matrix as your function, but it keeps the connection to your parameters log_sigma and atanh_rho.
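
As a quick sanity check (a minimal sketch using the parameters defined in the question), the matrix built this way reports requires_grad=True, confirming it is still attached to the graph:

sigma = torch.exp(log_sigma)
rho = torch.tanh(atanh_rho)
cov = make_covariance_matrix(sigma, rho)
print(cov.requires_grad)  # True: backpropagation can reach log_sigma and atanh_rho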

Here is the output before and after the step with the changed make_covariance_matrix. As you can see, you can now optimize your parameters, and the values do change:

Before:
mu: tensor([ 0.1191,  0.7215]), mu_hat: tensor([ 0.,  0.])
sigma: tensor([ 1.4222,  1.0949]), sigma_hat: tensor([ 1.,  1.])
rho: tensor([ 0.2558]), rho_hat: tensor([ 0.])

After:
mu: tensor([ 0.1191,  0.7215]), mu_hat: tensor([ 0.0712,  0.7781])
sigma: tensor([ 1.4222,  1.0949]), sigma_hat: tensor([ 1.4410,  1.0807])
rho: tensor([ 0.2558]), rho_hat: tensor([ 0.2235])
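
As a side note, torch.stack preserves the graph in the same way, so an equivalent sketch could build the matrix row by row (the helper name make_covariance_matrix_rows is just illustrative):

def make_covariance_matrix_rows(sigma, rho):
    # Build each row with torch.cat, then stack the rows; autograd is preserved.
    off_diag = rho * torch.prod(sigma)
    row0 = torch.cat([(sigma[0]**2).view(-1), off_diag])
    row1 = torch.cat([off_diag, (sigma[1]**2).view(-1)])
    return torch.stack([row0, row1])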

Hope this helps!
