Implementing gradient descent in TensorFlow instead of using the one provided with it


Problem description

I want to use gradient descent with momentum (keep track of previous gradients) while building a classifier in TensorFlow.

So I don't want to use tensorflow.train.GradientDescentOptimizer, but instead use tensorflow.gradients to calculate the gradients, keep track of previous gradients, and update the weights based on all of them.

How do I do this in TensorFlow?

Recommended answer

TensorFlow has an implementation of gradient descent with momentum.
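If the built-in optimizer is enough for your case, a minimal sketch of using it in the TF 1.x API might look like this (it assumes loss is your scalar loss tensor; the learning rate and momentum values are only illustrative):

import tensorflow as tf

# Built-in gradient descent with momentum (TF 1.x API);
# learning_rate and momentum below are placeholder values.
optimizer = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9)
train_op = optimizer.minimize(loss)

You would then run train_op in a session, exactly as you would with tf.train.GradientDescentOptimizer.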

To answer your general question about implementing your own optimization algorithm: TensorFlow gives you the primitives to calculate gradients and to update variables using the calculated gradients. In your model, suppose loss designates the loss function and var_list is a Python list of the TensorFlow variables in your model (which you can get by calling tf.all_variables or tf.trainable_variables); then you can calculate the gradients with respect to your variables as follows:

grads = tf.gradients(loss, var_list)

For simple gradient descent, you would simply subtract the product of the gradient and the learning rate from each variable. The code for that would look as follows:

var_updates = []
for grad, var in zip(grads, var_list):
  # One vanilla SGD step per variable: var <- var - learning_rate * grad
  var_updates.append(var.assign_sub(learning_rate * grad))
train_op = tf.group(*var_updates)

You can train your model by calling sess.run(train_op). Now, you can do all sorts of things before actually updating your variables. For instance, you can keep track of the gradients in a different set of variables and use them for the momentum algorithm (a sketch of this is shown below). Or, you can clip your gradients before updating the variables. All of these are simple TensorFlow operations, because the gradient tensors are no different from any other tensors that you compute in TensorFlow. Please look at the implementations of some of the fancier optimization algorithms (Momentum, RMSProp, Adam) to understand how you can implement your own.
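As a rough illustration of the momentum idea, here is a minimal sketch of a manual momentum update that keeps one accumulator variable per model variable. It reuses loss, var_list, and learning_rate from the snippets above; the momentum value is illustrative, and this is not the library's own implementation:

momentum = 0.9

# One non-trainable accumulator per model variable, initialized to zeros.
accumulators = [tf.Variable(tf.zeros_like(var.initialized_value()), trainable=False)
                for var in var_list]

grads = tf.gradients(loss, var_list)

var_updates = []
for grad, var, acc in zip(grads, var_list, accumulators):
  # Blend the new gradient into the running accumulator.
  new_acc = acc.assign(momentum * acc + grad)
  # Step the variable using the accumulated gradient; using new_acc here
  # makes the accumulator update run before the variable update.
  var_updates.append(var.assign_sub(learning_rate * new_acc))
train_op = tf.group(*var_updates)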
