Does tensorflow propagate gradients through a pdf

Question

Let's say a distribution function is defined as below:

dist = tf.contrib.distributions.Normal(mu, sigma)

A sample is drawn from the distribution:

val = dist.pdf(x)

and this value is used in a model to predict a variable:

X_hat = f(val)
loss = tf.norm(X_pred - X_hat, ord=2)

If I want to optimize the variables mu and sigma to reduce my prediction error, can I do the following?

train = tf.train.AdamOptimizer(1e-03).minimize(loss, var_list=[mu, sigma])

I am interested in knowing whether gradients are propagated through the normal distribution, or whether I should expect issues because I am taking gradients over the parameters defining the distribution.

Answer

tl;dr: Yes, gradient backpropagation will work correctly with tf.distributions.Normal.

dist.pdf(x) does not draw a sample from the distribution; it returns the probability density function evaluated at x. This is probably not what you wanted.
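
For instance (a minimal sketch, assuming TensorFlow 1.x, where this API lives; note that in tf.distributions the method is named prob rather than pdf):

import tensorflow as tf

mu = tf.Variable(0.0)
sigma = tf.Variable(1.0)
dist = tf.distributions.Normal(loc=mu, scale=sigma)

density = dist.prob(0.5)  # the density of N(mu, sigma) evaluated at x = 0.5
draw = dist.sample()      # an actual random draw from N(mu, sigma)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(density))  # deterministic given mu and sigma
    print(sess.run(draw))     # different on every call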

To get a random sample, what you really want is to call dist.sample(). For many random distributions, the dependency of a random sample on the parameters is nontrivial and will not necessarily be backpropable.

However, as @Richard_wth pointed out, specifically for the normal distribution it is possible, through reparametrization, to get a simple dependency on the location and scale parameters (mu and sigma).
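
Concretely: instead of sampling from N(mu, sigma²) directly, you sample eps from a parameter-free standard normal and compute mu + sigma * eps, so the sample depends on mu and sigma only through differentiable operations. A hand-rolled sketch of this trick (again assuming TF 1.x):

import tensorflow as tf

mu = tf.Variable(0.0)
sigma = tf.Variable(1.0)

eps = tf.random_normal(shape=[])  # randomness independent of mu and sigma
sample = mu + sigma * eps         # the reparametrized sample

# Both gradients are well defined:
# d(sample)/d(mu) = 1 and d(sample)/d(sigma) = eps.
grads = tf.gradients(sample, [mu, sigma])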

In fact, that is exactly how sample is implemented in tf.contrib.distributions.Normal (recently migrated to tf.distributions.Normal):

def _sample_n(self, n, seed=None):
  ...
  # Draw from a parameter-free standard normal, then shift and scale it:
  sampled = random_ops.random_normal(shape=shape, mean=0., stddev=1., ...)
  return sampled * self.scale + self.loc

Consequently, if you provide the scale and location parameters as tensors, then backpropagation will work correctly on those tensors.
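
You can check this directly (a sketch, assuming TF 1.x):

import tensorflow as tf

mu = tf.Variable(0.0)
sigma = tf.Variable(1.0)
dist = tf.distributions.Normal(loc=mu, scale=sigma)

s = dist.sample()
# Neither gradient is None: d(s)/d(mu) = 1 and
# d(s)/d(sigma) = (s - mu) / sigma, i.e. the underlying standard-normal draw.
g_mu, g_sigma = tf.gradients(s, [mu, sigma])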

Note that this backpropagation is inherently stochastic: the gradients vary with the random draw of the standard Gaussian variable. However, in the long run (over many training examples), this is likely to work as you expect.
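
Putting it together with the question's setup, a minimal end-to-end sketch (assuming TF 1.x; the linear map and constant target below are stand-ins for whatever f and X_pred the question has in mind):

import tensorflow as tf

mu = tf.Variable(0.0)
sigma = tf.Variable(1.0)
dist = tf.distributions.Normal(loc=mu, scale=sigma)

val = dist.sample([3])                 # reparametrized, hence backpropable
X_hat = 2.0 * val + 1.0                # stand-in for f(val)
X_pred = tf.constant([3.0, 3.0, 3.0])  # stand-in target
loss = tf.norm(X_pred - X_hat, ord=2)

train = tf.train.AdamOptimizer(1e-3).minimize(loss, var_list=[mu, sigma])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        sess.run(train)
    # Both variables receive gradient updates. (In practice you would also
    # constrain sigma to stay positive, e.g. via tf.nn.softplus.)
    print(sess.run([mu, sigma]))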
