Repeated use of GradientTape for multiple Jacobian calculations

Problem description

I am attempting to compute the Jacobian of a TensorFlow neural network's outputs with respect to its inputs. This is easily achieved with the tf.GradientTape.jacobian method. The trivial example provided in the TensorFlow documentation is as follows:

import tensorflow as tf

with tf.GradientTape() as g:
  x = tf.constant([1.0, 2.0])
  g.watch(x)   # explicitly track x so gradients with respect to it are recorded
  y = x * x
jacobian = g.jacobian(y, x)  # shape (2, 2): d y_i / d x_j

This is fine if I only want to compute the Jacobian of a single instance of the input tensor x. However, I need to repeatedly evaluate this Jacobian many, many times for various instances of x. For a non-trivial Jacobian calculation (e.g. for a deep convolutional neural network with non-linear activation functions), it is incredibly expensive to repeatedly rerun the GradientTape calculation and evaluate the jacobian method. I know from the TensorFlow documentation that the gradients (and hence the Jacobian) are computed via automatic differentiation. I have to imagine there is some internal storage of the analytical gradient of the network (computed by automatic differentiation) which is evaluated at the given inputs.
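To make the repeated per-input cost concrete, the naive approach described above looks roughly like the following sketch (the small Dense model is a hypothetical stand-in for the convolutional network; assumes TensorFlow 2.x in eager mode):

import tensorflow as tf

# Hypothetical stand-in for the deep convolutional network in question.
model = tf.keras.Sequential([
  tf.keras.Input(shape=(2,)),
  tf.keras.layers.Dense(8, activation="tanh"),
  tf.keras.layers.Dense(2),
])

def jacobian_at(x):
  with tf.GradientTape() as g:
    g.watch(x)
    y = model(x)
  # The tape and the Jacobian computation are rebuilt from scratch on every call.
  return g.jacobian(y, x)

for _ in range(100):
  x = tf.random.normal((1, 2))
  j = jacobian_at(x)  # repeated for each new input; this is the expensive part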

My question: am I correct in assuming that TensorFlow builds and stores (at least parts of) the analytical gradients needed to compute the Jacobian? And if so, is there a way to save this analytical gradient and re-evaluate the Jacobian with new inputs without having to reconstruct it via the GradientTape method?

A "persistent" GradientTape does not seem to solve this issue: it only allows for the repeated evaluation of a single GradientTape instance with respect to multiple internal arguments of the computation.

Recommended answer

Maybe you'll find this helpful:

I needed to compute the Jacobian of an arbitrary function many, many times. My problem was that I was using GradientTape inappropriately, but the code I posted might help you or give you some insight. I posted a self-contained example of calculating the Jacobian using both the session-based tf.gradients() function and the modern GradientTape approach. With help, I got them to run within the same order of magnitude of each other.

  • If your question is focused on trying to reuse the intermediate calculations between calls for a speed boost, then I think Nick's answer is more applicable.
  • If your question is focused on trying to make GradientTape as fast as a static graph, then make sure you wrap it in @tf.function, since it does just that (see the sketch after this list).
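A minimal sketch of that wrapping, assuming TensorFlow 2.x (the small Dense model is hypothetical): the forward pass and the gradient ops are traced into a graph on the first call and reused on later calls with inputs of the same shape and dtype.

import tensorflow as tf

model = tf.keras.Sequential([
  tf.keras.Input(shape=(2,)),
  tf.keras.layers.Dense(8, activation="tanh"),
  tf.keras.layers.Dense(2),
])

@tf.function  # trace once, then reuse the graph for same-shaped inputs
def jacobian_fn(x):
  with tf.GradientTape() as g:
    g.watch(x)
    y = model(x)
  return g.jacobian(y, x)

x = tf.random.normal((1, 2))
j1 = jacobian_fn(x)        # first call: builds the traced graph (slow)
j2 = jacobian_fn(x + 1.0)  # later calls with the same input signature reuse it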

See my question: Abysmal performance of tf.GradientTape for computing Jacobians compared to tf.gradients()
