Learning rate of custom training loop for TensorFlow 2.0


Question

Is there a function or method that can show the learning rate when I use a TensorFlow 2.0 custom training loop?

Here is an example from the TensorFlow guide:

def train_step(images, labels):
  with tf.GradientTape() as tape:
    predictions = model(images)
    loss = loss_object(labels, predictions)
  gradients = tape.gradient(loss, model.trainable_variables)
  optimizer.apply_gradients(zip(gradients, model.trainable_variables))

  train_loss(loss)
  train_accuracy(labels, predictions)

How can I retrieve the current learning rate from the optimizer while the model is training?

I will be grateful for any help you can provide. :)

Answer

In a custom training loop, you can call print(optimizer.lr.numpy()) to get the current learning rate.
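Note that optimizer.lr is only a plain scalar when the optimizer was built with a fixed float. If it was built with a learning-rate schedule, you would call the schedule at the current step instead (e.g. optimizer.lr(optimizer.iterations)). As a hedged illustration, the sketch below reproduces the documented decay rule of tf.keras.optimizers.schedules.ExponentialDecay in plain Python (the function name decayed_lr and the sample numbers are mine, chosen for demonstration):

```python
def decayed_lr(initial_lr, decay_rate, decay_steps, step):
    """Learning rate after `step` updates, per the documented
    ExponentialDecay rule (staircase=False):
        decayed_lr = initial_lr * decay_rate ** (step / decay_steps)
    """
    return initial_lr * decay_rate ** (step / decay_steps)

base_lr = 0.1
for step in (0, 1000, 2000):
    # The value a schedule-backed optimizer would be using at this step.
    print(step, decayed_lr(base_lr, decay_rate=0.5, decay_steps=1000, step=step))
```

With these sample settings the printed rate halves every 1000 steps: 0.1, then 0.05, then 0.025.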

If you are using the Keras API, you can define your own callback to record the current learning rate.

from tensorflow.keras.callbacks import Callback

class LRRecorder(Callback):
    """Record the current learning rate."""
    def on_epoch_begin(self, epoch, logs=None):
        lr = self.model.optimizer.lr
        print("The current learning rate is {}".format(lr.numpy()))

# your other callbacks 
callbacks.append(LRRecorder())
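To show when Keras would fire that hook without running a full fit(), here is a hedged stand-alone sketch: the model and optimizer objects are stand-ins I made up for demonstration (in real Keras, fit() assigns self.model to the callback and calls on_epoch_begin once per epoch):

```python
from types import SimpleNamespace

class FakeLR:
    """Stand-in for a TF variable: exposes .numpy() like optimizer.lr."""
    def __init__(self, value):
        self.value = value
    def numpy(self):
        return self.value

class LRRecorderDemo:
    """Same hook body as the Keras callback above, plus a history list."""
    def __init__(self, model):
        self.model = model        # Keras would set this via set_model()
        self.history = []
    def on_epoch_begin(self, epoch, logs=None):
        lr = self.model.optimizer.lr
        self.history.append(lr.numpy())
        print("The current learning rate is {}".format(lr.numpy()))

model = SimpleNamespace(optimizer=SimpleNamespace(lr=FakeLR(0.001)))
recorder = LRRecorderDemo(model)
for epoch in range(3):            # fit() invokes the hook once per epoch
    recorder.on_epoch_begin(epoch)
```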


Update

w := w - (base_lr * m / sqrt(v)) * grad = w - act_lr * grad

The learning rate we get above is base_lr. However, act_lr changes adaptively during training. Take the Adam optimizer as an example: act_lr is determined by base_lr, m, and v, where m and v are the first and second moment estimates of the parameters. Different parameters have different m and v values, so if you would like to know act_lr, you need to know the variable's name. For example, to find the act_lr of the variable Adam/dense/kernel, you can access its m and v like this:

for var in optimizer.variables():
  if 'Adam/dense/kernel/m' in var.name:
    print(var.name, var.numpy())

  if 'Adam/dense/kernel/v' in var.name:
    print(var.name, var.numpy())

Then you can easily calculate act_lr using the formula above.
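To make that arithmetic concrete, here is a minimal sketch of the formula in plain Python. The sample base_lr, m, and v values are made up for illustration (a real Adam step also adds a small epsilon to the denominator, omitted here to match the formula above):

```python
import math

def actual_lr(base_lr, m, v):
    # Per the update rule above:
    #   w := w - (base_lr * m / sqrt(v)) * grad = w - act_lr * grad
    # so the effective per-parameter rate is base_lr * m / sqrt(v).
    return base_lr * m / math.sqrt(v)

# Hypothetical moment values, as if read from the
# Adam/dense/kernel/m and Adam/dense/kernel/v slot variables:
print(actual_lr(base_lr=0.001, m=0.5, v=0.25))
```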
