Using op inputs when defining custom gradients in TensorFlow
Question
I'm trying to define a gradient method for my custom TF operation. Most of the solutions I have found online seem to be based on a gist by harpone (https://github.com/harpone). I'm reluctant to use that approach because it relies on py_func, which won't run on a GPU. I found another solution that uses tf.identity(), which looks more elegant and which I think will run on a GPU. However, I have some problems accessing the inputs of the op in my custom gradient function. Here's my code:
import tensorflow as tf  # TensorFlow 1.x graph-mode API

# Gradient function that will replace the gradient of the Identity op
@tf.RegisterGradient('MyCustomGradient')
def _custom_gradient(op, gradients):
    x = op.inputs[0]  # first input of the op this gradient is attached to
    return x

def my_op(w):
    return tf.pow(w, 3)

var_foo = tf.Variable(5, dtype=tf.float32)
bar = my_op(var_foo)

g = tf.get_default_graph()
with g.gradient_override_map({'Identity': 'MyCustomGradient'}):
    bar = tf.identity(bar)

g = tf.gradients(bar, var_foo)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(g))
I was expecting _custom_gradient() to return the input to the op (5 in this example), but instead it seems to return the op's output multiplied by the gradient. My custom my_op will contain non-differentiable operations such as tf.sign, and I'd like to define my custom gradient based on the inputs. What am I doing wrong?
Recommended answer
There is nothing wrong with your code.
Let's do the forward pass first:
var_foo = 5
-> bar = 125
-> tf.identity(bar) = 125
Now let's backpropagate:
The gradient of tf.identity(bar) with respect to its argument bar is (by your definition) equal to bar, that is, 125. The gradient of bar with respect to var_foo is 3 times the square of var_foo, which is 75. Multiply the two and you get 9375, which is indeed the output of your code.
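The arithmetic above can be checked without TensorFlow. The following is only a sketch in plain Python that mirrors the chain rule by hand; the helper my_op_grad is a hypothetical name introduced here for illustration and is not part of the original code.

```python
# Plain-Python check of the chain-rule arithmetic (no TensorFlow involved).

def my_op(w):
    # Same forward computation as tf.pow(w, 3)
    return w ** 3

def my_op_grad(w):
    # Hypothetical helper: analytic derivative of w**3 with respect to w
    return 3 * w ** 2

var_foo = 5.0
bar = my_op(var_foo)                  # forward pass: 125.0

# The overridden identity gradient returns op.inputs[0],
# which is the value fed into tf.identity, i.e. bar.
identity_grad = bar                   # 125.0

local_grad = my_op_grad(var_foo)      # 3 * 5**2 = 75.0

total = identity_grad * local_grad    # chain rule: 125 * 75
print(total)                          # 9375.0
```

The product matches the value the original script prints, which suggests the override is behaving as registered rather than misbehaving.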
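op.inputs[0] contains the forward-pass value of the op's first input. Because the gradient override is attached to the Identity op, op inside _custom_gradient refers to that identity op, not to my_op. In this case, the value fed to the identity op in the forward pass is 125, so op.inputs[0] is bar (125) rather than var_foo (5).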