Why is Adam.iterations always set to 0 in Keras?


Problem Description


I am currently trying to get into neural network building via keras/tensorflow and working through some example problems. At the moment I am trying to understand how to properly save and load my current model via model.save()/.load(). I would expect that, if everything is set up properly, loading a pre-trained model and continuing the training should not spoil my prior accuracies but simply continue exactly where I left off.


However, it doesn't. My accuracies fluctuate wildly after I load the model and take a while to actually return to my prior accuracies:

[Image: first run]

[Image: continued run]


After digging through various threads with possible explanations (none of them were applicable to my findings), I think I figured out the reason:


I use tf.keras.optimizers.Adam for my weight optimization, and after checking its initializer:

  def __init__(self, [...], **kwargs):
    super(Adam, self).__init__(**kwargs)
    with K.name_scope(self.__class__.__name__):
      # the step counter is (re-)created at 0 on every construction
      self.iterations = K.variable(0, dtype='int64', name='iterations')
      [...]

  def get_config(self):
    config = {
        'lr': float(K.get_value(self.lr)),
        'beta_1': float(K.get_value(self.beta_1)),
        'beta_2': float(K.get_value(self.beta_2)),
        'decay': float(K.get_value(self.decay)),
        'epsilon': self.epsilon,
        'amsgrad': self.amsgrad
        # note: 'iterations' is not part of the config dict
    }


it seems as if the "iterations" counter is always reset to 0, and its current value is neither stored nor loaded when the entire model is saved, since it is not part of the config dict. This seems to contradict the statement that model.save saves "the state of the optimizer, allowing to resume training exactly where you left off" (https://keras.io/getting-started/faq/). Since the iterations counter is the one which steers the exponential "dropout" of the learning rate in the Adam algorithm,

      lr = lr * (1. / (1. + self.decay * math_ops.cast(self.iterations,
                                                       K.dtype(self.decay))))


my model will always restart with the initial "large" learning rate, even if I set the "initial_epoch" parameter in model.fit() to the actual epoch number where my model was saved (see images uploaded above).
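
To see why a reset counter matters, here is a minimal plain-Python sketch of the decay expression above (the learning rate and decay values are illustrative assumptions, not taken from the question):

# Plain-Python sketch of Keras' time-based decay; numbers are illustrative.
def effective_lr(lr, decay, iterations):
    # mirrors lr * (1. / (1. + decay * iterations)) from get_updates
    return lr * (1. / (1. + decay * iterations))

print(effective_lr(0.001, 1e-4, 0))      # 0.001   -> fresh optimizer
print(effective_lr(0.001, 1e-4, 10000))  # 0.0005  -> after 10,000 updates
# If 'iterations' silently resets to 0 on load, training resumes at the
# full initial learning rate instead of the decayed one.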

So my questions are:

  • Is this intended behavior?
  • If so, how is this in agreement with the cited statement from the keras FAQ that model.save() "resumes training exactly where you left off"?
  • Is there a way to actually save and restore the Adam optimizer, including the iterations counter, without writing my own optimizer? (I already discovered that this is a possible solution, but I was wondering if there really is no simpler method.)


Edit: I found the reason/solution: I called model.compile after load_model, and this resets the optimizer while keeping the weights (see also "Does model.compile() initialize all the weights and biases in Keras (tensorflow backend)?").
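
For completeness, here is a minimal sketch of the resulting resume pattern (the file name, data variables, and epoch numbers are placeholder assumptions):

from keras.models import load_model

# load_model alone restores architecture, weights AND optimizer state;
# calling compile() again would throw the optimizer state away.
model = load_model('my_model.h5')    # placeholder file saved via model.save()

# model.compile(...)  # <-- the mistake described above: resets the optimizer

model.fit(x_train, y_train,          # placeholder training data
          epochs=20,                 # placeholder: train up to epoch 20
          initial_epoch=10)          # placeholder: checkpoint was at epoch 10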

Answer


The iterations value is restored, as can be seen in the code snippet below.

from keras import backend as K
from keras.models import load_model

model.save('dense_adam_keras.h5')
mdl = load_model('dense_adam_keras.h5')

print('iterations is ', K.get_session().run(mdl.optimizer.iterations))

iterations is  46


When load_model is called, the deserialize method is invoked to create the optimizer object, and then the set_weights method is called to restore the optimizer state from the saved weights:

https://github.com/keras-team/keras/blob/613aeff37a721450d94906df1a3f3cc51e2299d4/keras/optimizers.py#L742

https://github.com/keras-team/keras/blob/613aeff37a721450d94906df1a3f3cc51e2299d4/keras/optimizers.py#L103

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/optimizers.py
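
To illustrate the split this relies on, here is a hedged sketch (reusing the file name from the snippet above; the printed values are illustrative): get_config() carries only hyperparameters, while the optimizer state restored through set_weights includes the iterations counter.

from keras import backend as K
from keras.models import load_model

mdl = load_model('dense_adam_keras.h5')

# hyperparameters only -- there is no 'iterations' key in here
print(mdl.optimizer.get_config())

# mutable optimizer state: the step counter is saved alongside the moment
# estimates and restored by set_weights() during load_model
print(K.get_value(mdl.optimizer.iterations))   # e.g. 46, not 0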

