Optimise function for many pseudodata realisations in TensorFlow 2


Problem Description

My end goal is to simulate likelihood ratio test statistics; however, the core problem I am having is that I do not understand how to get TensorFlow 2 to perform many optimisations for different data inputs. Here is my attempt; hopefully it gives you an idea of what I am trying to do:

import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow_probability import distributions as tfd
import numpy as np

# Bunch of independent Poisson distributions that we want to combine
poises0 = [tfp.distributions.Poisson(rate = 10) for i in range(5)]

# Construct joint distributions
joint0 = tfd.JointDistributionSequential(poises0)

# Generate samples
N = int(1e3)
samples0 = joint0.sample(N)

# Now we need the same distributions but with floating parameters,
# and need to define the function to be minimised
# NB: randn() can initialise a rate below zero, which is invalid for a
# Poisson distribution and produces the NaN losses seen in the output
mus = [tf.Variable(np.random.randn(), name='mu{0}'.format(i)) for i in range(5)]

#@tf.function
def loss():
    poises_free = [tfp.distributions.Poisson(rate = mus[i]) for i in range(5)]
    joint_free = tfd.JointDistributionSequential(poises_free)
    # Construct (half of) test statistic
    return -2*(joint_free.log_prob(samples0))

# Minimise (for all samples? Apparently not?)
opt = tf.optimizers.SGD(0.1).minimize(loss,var_list=mus)

print(mus)
print(loss())
print(opt)
quit()

Output:

[<tf.Variable 'mu0:0' shape=() dtype=float32, numpy=53387.016>, <tf.Variable 'mu1:0' shape=() dtype=float32, numpy=2540.568>, <tf.Variable 'mu2:0' shape=() dtype=float32, numpy=-5136.6226>, <tf.Variable 'mu3:0' shape=() dtype=float32, numpy=-3714.5227>, <tf.Variable 'mu4:0' shape=() dtype=float32, numpy=1062.9396>]
tf.Tensor(
[nan nan nan nan ... nan nan nan], shape=(1000,), dtype=float32)
<tf.Variable 'UnreadVariable' shape=() dtype=int64, numpy=1>

In the end I want to compute the test statistic

q = -2*joint0.log_prob(samples0) - loss()

and show that it has a chi-squared distribution with 5 degrees of freedom (as Wilks' theorem predicts for 5 free parameters).

I am new to TensorFlow so perhaps I am doing this entirely wrong, but I hope you get the idea of what I want.

So I played around a bit more, and I suppose that TensorFlow simply doesn't perform optimisations over the input tensors in parallel as I assumed. Or perhaps it can, but I need to set things up differently, i.e. perhaps give it a tensor of input parameters and one gigantic joint loss function for all the minimisations at once?
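
A minimal sketch of that batched-variable idea (my own illustration, not from the original post; the shapes are hypothetical):

import tensorflow as tf

N = 1000
# One scalar variable per parameter can only fit a single dataset...
mu_scalar = tf.Variable(10.0)
# ...whereas one length-N vector per parameter fits all N pseudo-datasets
# at once: entry i of the summed loss only depends on sample i, so each
# entry receives its own independent gradient update.
mu_batched = tf.Variable(10.0 * tf.ones(N))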

I also tried doing things with a simple loop just to see what happens. As predicted it is pathetically slow, but I also don't even get the right answer:

import matplotlib.pyplot as plt
import seaborn as sns

poises0 = [tfp.distributions.Poisson(rate = 10) for i in range(5)]
joint0 = tfd.JointDistributionSequential(poises0)

N = int(5e2)
samples0 = joint0.sample(N)

mus = [tf.Variable(10., name='mu{0}'.format(i)) for i in range(5)]

#@tf.function
def loss(xi):
    def loss_inner():
        poises_free = [tfp.distributions.Poisson(rate = mus[i]) for i in range(5)]
        joint_free = tfd.JointDistributionSequential(poises_free)
        # Construct (half of) test statistic
        return -2*(joint_free.log_prob(xi))
    return loss_inner

# Minimise
# I think I have to loop over the samples... bit lame. Can perhaps parallelise though.
q = []
for i in range(N):
    xi = [x[i] for x in samples0]
    # NB: minimize() takes only a single gradient step per call
    opt = tf.optimizers.SGD(0.1).minimize(loss=loss(xi), var_list=mus)
    q += [-2*joint0.log_prob(xi) - loss(xi)()]

fig = plt.figure()
ax = fig.add_subplot(111)
sns.distplot(q, kde=False, ax=ax, norm_hist=True)
qx = np.linspace(np.min(q),np.max(q),1000)
qy = np.exp(tfd.Chi2(df=5).log_prob(qx))
sns.lineplot(qx,qy)
plt.show()

The output is not a chi-squared distribution with DOF=5. Indeed the test statistic often has negative values, which means that the optimized result is often a worse fit than the null hypothesis, which should be impossible.
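
A quick sanity check along these lines (a minimal sketch, assuming q is the list of per-sample statistics accumulated in the loop above) makes the problem concrete:

import numpy as np

q_arr = np.array(q)
# The fraction of negative values should be ~0 if every fit beats the null
print("fraction of negative q:", np.mean(q_arr < 0))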

Here is an attempt at the "monster" solution where I minimize a giant network of different input variables for each pseudodata realization all at once. This feels more like something that TensorFlow might be good at doing, though I feel like I will run out of RAM once I go to large sets of pseudo-data. Still, I can probably loop over batches of pseudo-data.

poises0 = [tfp.distributions.Poisson(rate = 10) for i in range(5)]
joint0 = tfd.JointDistributionSequential(poises0)

N = int(5e3)
samples0 = joint0.sample(N)

mus = [tf.Variable(10*np.ones(N, dtype='float32'), name='mu{0}'.format(i)) for i in range(5)]

# NB: qM is built here, outside the loss function, so the minimize()
# call below has no gradient path to mus; this causes the error shown afterwards
poises_free = [tfp.distributions.Poisson(rate = mus[i]) for i in range(5)]
joint_free = tfd.JointDistributionSequential(poises_free)
qM = -2*(joint_free.log_prob(samples0))

@tf.function
def loss():
    return tf.math.reduce_sum(qM,axis=0)

# Minimise
opt = tf.optimizers.SGD(0.1).minimize(loss,var_list=mus)
print("parameters:", mus)
print("loss:", loss())
q0 =-2*joint0.log_prob(samples0)
print("q0:", q0)
print("qM:", qM)
q = q0 - qM

fig = plt.figure()
ax = fig.add_subplot(111)
sns.distplot(q, kde=False, ax=ax, norm_hist=True)
qx = np.linspace(np.min(q),np.max(q),1000)
qy = np.exp(tfd.Chi2(df=5).log_prob(qx))
sns.lineplot(qx,qy)
plt.show()

Unfortunately I now get the error:

Traceback (most recent call last):
  File "testing3.py", line 35, in <module>
    opt = tf.optimizers.SGD(0.1).minimize(loss,var_list=mus)   
  File "/home/farmer/anaconda3/envs/general/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 298, in minimize
    return self.apply_gradients(grads_and_vars, name=name)
  File "/home/farmer/anaconda3/envs/general/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 396, in apply_gradients
    grads_and_vars = _filter_grads(grads_and_vars)
  File "/home/farmer/anaconda3/envs/general/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 924, in _filter_grads
    ([v.name for _, v in grads_and_vars],))
ValueError: No gradients provided for any variable: ['mu0:0', 'mu1:0', 'mu2:0', 'mu3:0', 'mu4:0'].

which I suppose is a basic sort of error. I think I just don't understand how TensorFlow keeps track of the derivatives it needs to compute. It seems like things work if I define variables inside the loss function rather than outside, but I need them outside in order to access their values later. So I guess I don't understand something here.
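
The failure mode can be reproduced directly with tf.GradientTape (a minimal sketch of my own, not from the original post): a tensor computed before the tape starts recording is just a constant as far as the tape is concerned, so its gradient comes back as None:

import tensorflow as tf

x = tf.Variable(3.0)

y_outside = x**2          # computed before any tape is recording

with tf.GradientTape() as tape:
    y_inside = x**2       # computed while the tape is recording
print(tape.gradient(y_inside, x))   # tf.Tensor(6.0, shape=(), dtype=float32)

with tf.GradientTape() as tape:
    loss = y_outside      # merely reuses a pre-computed tensor
print(tape.gradient(loss, x))       # None -> "No gradients provided" in minimize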

Answer

OK, so here is what I came up with. The key things I was missing were:

  1. Define the input variables as giant tensors so that all the minimisations can occur at once.
  2. Construct a single combined loss function for all the minimisations at once.
  3. Construct the intermediate variables for the loss computation inside the loss function definition, so that TensorFlow can track the gradients (I think the minimize function wraps the loss function in a gradient tape or some such; see the sketch after this list).
  4. Define the loss function as part of a class so that the intermediate variables can be stored.
  5. minimize performs only one step of the minimisation, so we need to loop over it many times until it converges according to some criterion.
  6. I was running into some NaNs because means less than zero are invalid for Poisson distributions, so I needed to add a constraint on the input variables.
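
A rough sketch of what a single minimize call boils down to internally (my own paraphrase of points 3 and 5, not the actual TensorFlow implementation):

import tensorflow as tf

def one_minimize_step(loss_fn, var_list, lr=0.1):
    # Roughly: record the loss under a tape, differentiate w.r.t. the
    # variables, and apply one plain-SGD update. The loss graph must be
    # built *inside* the tape, and the caller must loop to convergence.
    with tf.GradientTape() as tape:
        loss_value = loss_fn()
    grads = tape.gradient(loss_value, var_list)
    for g, v in zip(grads, var_list):
        v.assign_sub(lr * g)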

With this, I can now do the equivalent of a million minimisations in like 10 seconds on my laptop, which is pretty nice!

import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow_probability import distributions as tfd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

# Bunch of independent Poisson distributions that we want to combine
poises0 = [tfd.Poisson(rate = 10) for i in range(5)]

# Construct joint distributions
joint0 = tfd.JointDistributionSequential(poises0)

N = int(1e6)
samples0 = joint0.sample(N)

class Model(object):
  def __init__(self):
     self.mus = [tf.Variable(10*np.ones(N, dtype='float32'), name='mu{0}'.format(i),
                    constraint=lambda x: tf.clip_by_value(x, 0.000001, np.inf)) for i in range(5)]

  def loss(self):
     poises_free = [tfd.Poisson(rate = self.mus[i]) for i in range(5)]
     joint_free = tfd.JointDistributionSequential(poises_free)
     # Construct (half of) test statistic
     self.qM = -2*(joint_free.log_prob(samples0))
     self.last_loss = tf.math.reduce_sum(self.qM,axis=0)
     return self.last_loss

model = Model()

# Minimise
tol = 0.01 * N
delta_loss = 1e99
prev_loss = 1e99
i = 0
print("tol:", tol)
opt = tf.optimizers.SGD(0.1)  # build the optimizer once rather than once per step
while delta_loss > tol:
    opt.minimize(model.loss, var_list=model.mus)
    delta_loss = np.abs(prev_loss - model.last_loss)
    print("i:", i, " delta_loss:", delta_loss)
    i += 1
    prev_loss = model.last_loss

q0 =-2*joint0.log_prob(samples0)
q = q0 - model.qM

print("parameters:", model.mus)
print("loss:", model.last_loss)
print("q0:", q0)
print("qM:", model.qM)

fig = plt.figure()
ax = fig.add_subplot(111)
sns.distplot(q, kde=False, ax=ax, norm_hist=True)
qx = np.linspace(np.min(q),np.max(q),1000)
qy = np.exp(tfd.Chi2(df=5).log_prob(qx))
sns.lineplot(qx,qy)
plt.show()

Output:

tol: 10000.0
i: 0  delta_loss: inf
i: 1  delta_loss: 197840.0
i: 2  delta_loss: 189366.0
i: 3  delta_loss: 181456.0
i: 4  delta_loss: 174040.0
i: 5  delta_loss: 167042.0
i: 6  delta_loss: 160448.0
i: 7  delta_loss: 154216.0
i: 8  delta_loss: 148310.0
i: 9  delta_loss: 142696.0
i: 10  delta_loss: 137352.0
i: 11  delta_loss: 132268.0
i: 12  delta_loss: 127404.0
...
i: 69  delta_loss: 11894.0
i: 70  delta_loss: 11344.0
i: 71  delta_loss: 10824.0
i: 72  delta_loss: 10318.0
i: 73  delta_loss: 9860.0
parameters: [<tf.Variable 'mu0:0' shape=(1000000,) dtype=float32, numpy=
array([ 6.5849004, 14.81182  ,  7.506216 , ..., 10.       , 11.491933 ,
       10.760278 ], dtype=float32)>, <tf.Variable 'mu1:0' shape=(1000000,) dtype=float32, numpy=
array([12.881036,  7.506216, 12.881036, ...,  7.506216, 14.186232,
       10.760278], dtype=float32)>, <tf.Variable 'mu2:0' shape=(1000000,) dtype=float32, numpy=
array([16.01586  ,  8.378036 , 12.198007 , ...,  6.5849004, 12.198007 ,
        8.378036 ], dtype=float32)>, <tf.Variable 'mu3:0' shape=(1000000,) dtype=float32, numpy=
array([10.      ,  7.506216, 12.198007, ...,  9.207426, 10.760278,
       11.491933], dtype=float32)>, <tf.Variable 'mu4:0' shape=(1000000,) dtype=float32, numpy=
array([ 8.378036 , 14.81182  , 10.       , ...,  6.5849004, 12.198007 ,
       10.760278 ], dtype=float32)>]
loss: tf.Tensor(20760090.0, shape=(), dtype=float32)
q0: tf.Tensor([31.144037 31.440613 25.355555 ... 24.183338 27.195362 22.123463], shape=(1000000,), dtype=float32)
qM: tf.Tensor([21.74377  21.64162  21.526024 ... 19.488544 22.40428  21.08519 ], shape=(1000000,), dtype=float32)

The result is now chi-squared with DOF=5! Or at least pretty close.
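
To quantify "pretty close", one option is a Kolmogorov-Smirnov test of the simulated q values against the chi-squared distribution (a minimal sketch, assuming scipy is available and q is the tensor computed above):

from scipy import stats

# A large p-value means the simulated test statistics are consistent
# with the chi2(df=5) prediction from Wilks' theorem
ks_stat, p_value = stats.kstest(np.asarray(q), 'chi2', args=(5,))
print("KS statistic:", ks_stat, " p-value:", p_value)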
