TensorFlow Custom Estimator - Restore model after small changes in model_fn


Question

I am using tf.estimator.Estimator to develop my model.

I wrote a model_fn and trained it for 50,000 iterations. Now I want to make a small change to my model_fn, for example adding a new layer.

I don't want to start training from scratch. I want to restore all the old variables from the checkpoint at step 50,000 and continue training from that point. When I try to do so, I get a NotFoundError.

How can this be done with tf.estimator.Estimator?

Recommended answer

TL;DR: The easiest way to load variables from a previous checkpoint is to use tf.train.init_from_checkpoint(). A single call to this function inside the model_fn of your Estimator overrides the initializers of the corresponding variables.

In more detail, suppose you have trained a first model with two hidden layers on MNIST, named model_fn_1. The weights are saved in the directory mnist_1.

import tensorflow as tf  # TensorFlow 1.x API (tf.layers, tf.losses, tf.train)

def model_fn_1(features, labels, mode):
    images = features['image']

    h1 = tf.layers.dense(images, 100, activation=tf.nn.relu, name="h1")
    h2 = tf.layers.dense(h1, 100, activation=tf.nn.relu, name="h2")

    logits = tf.layers.dense(h2, 10, name="logits")

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

    optimizer = tf.train.GradientDescentOptimizer(0.01)
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())

    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

# Estimator 1: two hidden layers
estimator_1 = tf.estimator.Estimator(model_fn_1, model_dir='mnist_1')

estimator_1.train(input_fn=train_input_fn, steps=1000)


Second model with three hidden layers

Now we want to train a new model, model_fn_2, with three hidden layers, and load the weights of the first two hidden layers, h1 and h2, from the first model. We use tf.train.init_from_checkpoint() to do this:

def model_fn_2(features, labels, mode, params):
    images = features['image']

    h1 = tf.layers.dense(images, 100, activation=tf.nn.relu, name="h1")
    h2 = tf.layers.dense(h1, 100, activation=tf.nn.relu, name="h2")
    h3 = tf.layers.dense(h2, 100, activation=tf.nn.relu, name="h3")

    assignment_map = {
        'h1/': 'h1/',
        'h2/': 'h2/'
    }
    tf.train.init_from_checkpoint('mnist_1', assignment_map)

    logits = tf.layers.dense(h3, 10, name="logits")

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

    optimizer = tf.train.GradientDescentOptimizer(0.01)
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())

    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

# Estimator 2: three hidden layers
estimator_2 = tf.estimator.Estimator(model_fn_2, model_dir='mnist_2')

estimator_2.train(input_fn=train_input_fn, steps=1000)

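The assignment_map loads every variable under scope h1/ in the checkpoint into the scope h1/ of the new model, and likewise for h2/. Don't forget the trailing /, which tells TensorFlow that the entry is a variable scope rather than a single variable. The h3 and logits layers are not in the map, so they keep their regular initializers. Note also that estimator_2 writes to a fresh model_dir (mnist_2), so the Estimator does not try to restore the new h3 variables from an old checkpoint, which is the likely cause of the NotFoundError in the question.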
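To make the scope matching more concrete, here is a small pure-Python sketch of how a scope-to-scope assignment_map could resolve variable names. It is a simplified illustration, not TensorFlow's implementation: the helper resolve_assignment_map is hypothetical, and the variable names (h1/kernel, h1/bias, and so on) are assumed to be the ones tf.layers.dense creates for the layers above.

```python
def resolve_assignment_map(assignment_map, checkpoint_vars, current_vars):
    """Return {current_variable_name: checkpoint_variable_name}.

    Hypothetical helper that mimics scope-to-scope matching only.
    Keys of assignment_map are checkpoint scopes, values are scopes
    in the current model. This is not TensorFlow's actual code.
    """
    resolved = {}
    for ckpt_scope, current_scope in sorted(assignment_map.items()):
        # The sketch handles scope entries only. In TensorFlow, an entry
        # without a trailing '/' is treated as a single variable name.
        if not (ckpt_scope.endswith("/") and current_scope.endswith("/")):
            raise ValueError("scope entries must end with '/'")
        for name in sorted(current_vars):
            if not name.startswith(current_scope):
                continue
            # Swap the current scope prefix for the checkpoint scope prefix.
            ckpt_name = ckpt_scope + name[len(current_scope):]
            if ckpt_name not in checkpoint_vars:
                raise ValueError(f"{ckpt_name} is not in the checkpoint")
            resolved[name] = ckpt_name
    return resolved


# Assumed variable names for the two models in this answer.
checkpoint_vars = {
    "global_step",
    "h1/kernel", "h1/bias",
    "h2/kernel", "h2/bias",
    "logits/kernel", "logits/bias",
}
current_vars = checkpoint_vars | {"h3/kernel", "h3/bias"}

mapping = resolve_assignment_map(
    {"h1/": "h1/", "h2/": "h2/"}, checkpoint_vars, current_vars
)
for current_name, ckpt_name in sorted(mapping.items()):
    print(current_name, "<-", ckpt_name)
```

Only the h1 and h2 variables end up in the mapping. Adding an entry for h3/ would fail in this sketch, because the first checkpoint contains no variables under that scope.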

I couldn't find a way to make this work with pre-made estimators, since you can't change their model_fn.
