What does opt.apply_gradients() do in TensorFlow?
Question
The documentation is not quite clear about this. I suppose the gradients one can obtain by opt.compute_gradients(E, [v]) contain ∂E/∂x = g(x) for each element x of the tensor that v stores. Does opt.apply_gradients(grads_and_vars) essentially execute x ← x − η·g(x), where η is the learning rate? That would imply that if I want to add a positive additive change p to the variable, I would need to change g(x) ← g(x) − (1/η)·p, e.g. like this:
opt = tf.train.GradientDescentOptimizer(learning_rate=eta)
grads_and_vars = opt.compute_gradients(loss, var_list)
for i, (g, v) in enumerate(grads_and_vars):
    # shift the gradient so the applied step becomes x <- x - eta*g + p
    grads_and_vars[i] = (g - (1 / eta) * p, v)
train_op = opt.apply_gradients(grads_and_vars)
Is there a better way to do this?
Answer
The update rule that the apply_gradients method actually applies depends on the specific optimizer. Take a look at the implementation of apply_gradients in the tf.train.Optimizer class here. It relies on the derived classes implementing the update rule in the methods _apply_dense and _apply_sparse. The update rule you are referring to is implemented by GradientDescentOptimizer.
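That dispatch can be sketched in plain Python. This is a simplified illustration, not TensorFlow's actual code: the class and method names mirror the real ones, but the plain-float Variable and the bodies are assumptions made for brevity.

```python
class Variable:
    """Stand-in for a TF variable: just a mutable float."""
    def __init__(self, value):
        self.value = value

class Optimizer:
    """Base class: apply_gradients only dispatches; it carries no update rule itself."""
    def apply_gradients(self, grads_and_vars):
        for g, v in grads_and_vars:
            self._apply_dense(g, v)  # the derived class supplies the rule

    def _apply_dense(self, grad, var):
        raise NotImplementedError

class GradientDescentOptimizer(Optimizer):
    """Implements the update rule x <- x - eta * g(x)."""
    def __init__(self, learning_rate):
        self._learning_rate = learning_rate

    def _apply_dense(self, grad, var):
        var.value -= self._learning_rate * grad

# usage: one step on a single (gradient, variable) pair
v = Variable(1.0)
opt = GradientDescentOptimizer(learning_rate=0.1)
opt.apply_gradients([(2.0, v)])
print(v.value)  # 1.0 - 0.1 * 2.0 = 0.8
```

Swapping in a different subclass changes the rule that apply_gradients ends up executing, which is why the question cannot be answered for Optimizer in general.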
Regarding your desired positive additive update: if what you are calling opt is an instantiation of GradientDescentOptimizer, then you could indeed achieve what you want to do by
grads_and_vars = opt.compute_gradients(E, [v])
eta = opt._learning_rate  # note: a private attribute, so this may break across versions
my_grads_and_vars = [(g - (1 / eta) * p, v) for g, v in grads_and_vars]
opt.apply_gradients(my_grads_and_vars)
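As a quick sanity check of the algebra (plain Python, no TensorFlow; the numbers are arbitrary): feeding the shifted gradient g − (1/η)·p into a step of size η produces exactly the usual step plus p.

```python
eta = 0.1                # learning rate
x, g, p = 1.0, 2.0, 0.5  # current value, gradient, desired additive change

shifted_g = g - (1 / eta) * p  # the modified gradient passed to apply_gradients
x_new = x - eta * shifted_g    # the step the optimizer would then take

# equivalent to the plain gradient step plus the additive change p
assert abs(x_new - (x - eta * g + p)) < 1e-12
print(x_new)  # 1.0 - 0.1 * 2.0 + 0.5 = 1.3
```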
The more elegant way to do this is probably to write a new optimizer (inheriting from tf.train.Optimizer) that implements your desired update rule directly.