Repeated use of GradientTape for multiple Jacobian calculations
Question
I am attempting to compute the Jacobian of a TensorFlow neural network's outputs with respect to its inputs. This is easily achieved with the tf.GradientTape.jacobian method. The trivial example provided in the TensorFlow documentation is as follows:
import tensorflow as tf

with tf.GradientTape() as g:
    x = tf.constant([1.0, 2.0])
    g.watch(x)
    y = x * x
jacobian = g.jacobian(y, x)
This is fine if I only want to compute the Jacobian of a single instance of the input tensor x. However, I need to repeatedly evaluate this Jacobian many, many times for various instances of x. For a non-trivial Jacobian calculation (e.g. for a deep convolutional neural network with non-linear activation functions), it is incredibly expensive to repeatedly rerun the GradientTape calculation and evaluate the jacobian method. I know from the TensorFlow documentation that the gradients (and hence the Jacobian) are computed via automatic differentiation. I have to imagine there is some internal storage of the analytical gradient of the network (computed by automatic differentiation) which is evaluated at the given inputs.
My question: am I correct in assuming that TensorFlow builds and stores (at least parts of) the analytical gradients needed to compute the Jacobian? And if so, is there a way to save this analytical gradient and re-evaluate the Jacobian with new inputs without having to reconstruct it via the GradientTape method?
A "persistent" GradientTape does not seem to solve this issue: it only allows for the repeated evaluation of a single GradientTape instance with respect to multiple internal arguments of the computation.
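To make the limitation concrete, here is a minimal sketch of what persistent=True does buy you: several jacobian/gradient queries against the same recorded computation, but not re-evaluation on fresh inputs (the tensors y and z below are illustrative placeholders, not from the original post).

```python
import tensorflow as tf

x = tf.constant([1.0, 2.0])
with tf.GradientTape(persistent=True) as g:
    g.watch(x)
    y = x * x       # y = x^2
    z = y * y       # z = x^4

# persistent=True permits multiple queries on the SAME tape...
jac_y = g.jacobian(y, x)  # dy/dx = diag(2x)
jac_z = g.jacobian(z, x)  # dz/dx = diag(4x^3), a second query
del g  # release the tape's resources when finished

# ...but feeding a new x still requires recording a new tape.
```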
Answer
Maybe you'll find this helpful:
I needed to compute the jacobian of an arbitrary function many, many times. My problem was that I was using GradientTape inappropriately, but the code I posted might help you or give you some insight. I posted a self-contained example of calculating the jacobian using both the session-based tf.gradients() function and the modern GradientTape approach. With help, I got them to run within the same order of magnitude of each other.
- If your question is focused on trying to reuse the intermediate calculations between calls for a speed boost, then I think Nick's answer is more applicable.
- If your question is focused on trying to make GradientTape as fast as a static graph, then make sure you wrap it in @tf.function, since it does just that.
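The @tf.function wrapping mentioned above can be sketched as follows: the tape and the jacobian computation are traced into a static graph once per input signature, so subsequent calls with new values of x reuse the compiled gradient graph instead of rebuilding it eagerly each time (the function name and toy computation y = x * x are illustrative, not from the original post).

```python
import tensorflow as tf

@tf.function  # trace once, then reuse the graph for new inputs
def jacobian_fn(x):
    with tf.GradientTape() as g:
        g.watch(x)
        y = x * x  # stand-in for a network's forward pass
    return g.jacobian(y, x)

j1 = jacobian_fn(tf.constant([1.0, 2.0]))  # traced on the first call
j2 = jacobian_fn(tf.constant([3.0, 4.0]))  # reuses the traced graph
```

Note that passing inputs with a new shape or dtype triggers a retrace, so for best reuse keep the input signature stable (or supply an explicit input_signature to tf.function).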
See my question: Abysmal tf.GradientTape performance compared to tf.gradients() for computing jacobians