Eager execution: gradient computation
Problem description
I'm wondering why this very simple gradient computation isn't working correctly. It actually produces a [None, None] vector, which is obviously not the desired output.
import tensorflow as tf

tf.enable_eager_execution()

a = tf.constant(0.)
with tf.GradientTape() as tape:
    b = 2 * a
da, db = tape.gradient(a + b, [a, b])
print(da)
print(db)
Recommended answer
There are two minor issues with the code snippet you posted:
1. The a + b computation happens outside the tape context, so it is not recorded. Note that GradientTape can only differentiate computation that it has recorded; computing a + b inside the tape context will fix that.
2. Source tensors need to be "watched". There are two ways to signal to the tape that a tensor should be watched: (a) explicitly invoking tape.watch, or (b) using a tf.Variable (all variables are watched automatically); see the documentation.
Long story short, two trivial modifications to your snippet do the trick:
import tensorflow as tf

tf.enable_eager_execution()

a = tf.constant(0.)
with tf.GradientTape() as tape:
    tape.watch(a)  # (2) explicitly watch the source tensor
    b = 2 * a
    c = a + b      # (1) compute a + b inside the tape context
da, db = tape.gradient(c, [a, b])
print(da)  # 3.0, since dc/da = 1 + 2
print(db)  # 1.0
Hope that helps.