How can I intercept the gradient from automatic differentiation in TensorFlow?


Question

Let's say I have two subsequent layers with activations a1 and a2. Is there a way to intercept the gradients that automatic differentiation propagates from layer 2 to layer 1, i.e. ∂E/∂a2? I would like to change this gradient and then pass it on to layer 1.

Answer

From the tf.train.Optimizer documentation:

Processing gradients before applying them.

Calling minimize() takes care of both computing the gradients and applying them to the variables. If you want to process the gradients before applying them you can instead use the optimizer in three steps:

1. Compute the gradients with compute_gradients().
2. Process the gradients as you wish.
3. Apply the processed gradients with apply_gradients().

Example:

# Create an optimizer.
opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)

# Compute the gradients for a list of variables.
grads_and_vars = opt.compute_gradients(loss, <list of variables>)

# grads_and_vars is a list of tuples (gradient, variable).  Do whatever you
# need to the 'gradient' part, for example cap them, etc.
capped_grads_and_vars = [(MyCapper(gv[0]), gv[1]) for gv in grads_and_vars]

# Ask the optimizer to apply the capped gradients.
opt.apply_gradients(capped_grads_and_vars)
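
For concreteness, here is a minimal, self-contained sketch of the same three-step pattern in TensorFlow 1.x, built around the two-layer setup from the question. The layer sizes, the names (x, y, w1, w2, a1, a2), and the choice of clipping gradients to [-1, 1] are illustrative assumptions, not something prescribed by the documentation.

import tensorflow as tf

# Toy two-layer network with activations a1 and a2 (shapes are arbitrary).
x = tf.placeholder(tf.float32, shape=[None, 4])
y = tf.placeholder(tf.float32, shape=[None, 1])

w1 = tf.Variable(tf.random_normal([4, 8]))
a1 = tf.nn.relu(tf.matmul(x, w1))

w2 = tf.Variable(tf.random_normal([8, 1]))
a2 = tf.matmul(a1, w2)

loss = tf.reduce_mean(tf.square(a2 - y))

opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)

# Step 1: compute the gradients instead of calling minimize().
grads_and_vars = opt.compute_gradients(loss, var_list=[w1, w2])

# Step 2: process the gradients, e.g. clip each one to [-1, 1] (illustrative).
processed_grads_and_vars = [(tf.clip_by_value(g, -1.0, 1.0), v)
                            for g, v in grads_and_vars]

# Step 3: apply the processed gradients.
train_op = opt.apply_gradients(processed_grads_and_vars)

Note that compute_gradients() exposes the gradients with respect to the variables (w1 and w2 here); this optimizer hook is where you can modify them before the update is applied.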

