Restoring a Tensorflow model that uses Iterators


Problem Description

I have a model that trains my network using an Iterator, following the new Dataset API pipeline model now recommended by Google.

I read tfrecord files, feed data to the network, and training goes well. I save my model at the end of training so I can run inference on it later. A simplified version of the code is as follows:

""" Training and saving """

training_dataset = tf.contrib.data.TFRecordDataset(training_record)
training_dataset = training_dataset.map(ds._path_records_parser)
training_dataset = training_dataset.batch(BATCH_SIZE)
with tf.name_scope("iterators"):
  training_iterator = Iterator.from_structure(training_dataset.output_types, training_dataset.output_shapes)
  next_training_element = training_iterator.get_next()
  training_init_op = training_iterator.make_initializer(training_dataset)

def train(num_epochs):
  # compute for the number of epochs
  for e in range(1, num_epochs+1):
    session.run(training_init_op) #initializing iterator here
    while True:
      try:
        images, labels = session.run(next_training_element)
        session.run(optimizer, feed_dict={x: images, y_true: labels})
      except tf.errors.OutOfRangeError:
        saver_name = './saved_models/ucf-model'
        print("Finished Training Epoch {}".format(e))
        break



    """ Restoring """
# restoring the saved model and its variables
session = tf.Session()
saver = tf.train.import_meta_graph(r'saved_models\ucf-model.meta')
saver.restore(session, tf.train.latest_checkpoint('.\saved_models'))
graph = tf.get_default_graph()

# restoring relevant tensors/ops
accuracy = graph.get_tensor_by_name("accuracy/Mean:0") #the tensor that when evaluated returns the mean accuracy of the batch
testing_iterator = graph.get_operation_by_name("iterators/Iterator") #my iterator used in testing.
next_testing_element = graph.get_operation_by_name("iterators/IteratorGetNext") #the GetNext operator for my iterator
# loading my testing set tfrecords
testing_dataset = tf.contrib.data.TFRecordDataset(testing_record_path)
testing_dataset = testing_dataset.map(ds._path_records_parser, num_threads=4, output_buffer_size=BATCH_SIZE*20)
testing_dataset = testing_dataset.batch(BATCH_SIZE)

testing_init_op = testing_iterator.make_initializer(testing_dataset) #to initialize the dataset

with tf.Session() as session:
  session.run(testing_init_op)
  while True:
    try:
      images, labels = session.run(next_testing_element)
      accuracy = session.run(accuracy, feed_dict={x: test_images, y_true: test_labels}) #error here, x, y_true not defined
    except tf.errors.OutOfRangeError:
      break

My problem arises mainly when I restore the model. How do I feed testing data to the network?

  • When I restore my Iterator using testing_iterator = graph.get_operation_by_name("iterators/Iterator") and next_testing_element = graph.get_operation_by_name("iterators/IteratorGetNext"), I get the following error: GetNext() failed because the iterator has not been initialized. Ensure that you have run the initializer operation for this iterator before getting the next element.
  • So I tried to initialize my dataset using testing_init_op = testing_iterator.make_initializer(testing_dataset). I got this error: AttributeError: 'Operation' object has no attribute 'make_initializer' (see the sketch just after this list).
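
The second error makes sense once you look at what the graph actually hands back: get_operation_by_name always returns a plain tf.Operation, while make_initializer is a method of the Python-side tf.data.Iterator wrapper, which is not serialized into the meta graph. A minimal sketch of the situation (reusing the names from my code above):

restored_op = graph.get_operation_by_name("iterators/Iterator")
print(type(restored_op))  # <class 'tensorflow.python.framework.ops.Operation'> -- no make_initializer here
# the GetNext outputs are still available as plain tensors, but fetching them
# only works after the iterator has been initialized by some restored init op
next_testing_elements = graph.get_operation_by_name("iterators/IteratorGetNext").outputs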

Another issue is that, since an Iterator is being used, there is no need for placeholders in the training model: the Iterator feeds data directly to the graph. But then, in the third-to-last line of the restore snippet, how do I restore my feed_dict keys when I feed data to the "accuracy" op?

If someone could suggest a way to add placeholders between the Iterator and the network input, then I could try running the graph by evaluating the "accuracy" tensor while feeding data to the placeholders and ignoring the Iterator altogether.
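
For example (a sketch only; the shape values, NUM_CLASSES, and the placeholder names below are made up for illustration, not taken from my actual model), tf.placeholder_with_default could be spliced between the iterator and the network at graph-construction time:

images_batch, labels_batch = next_training_element
# each placeholder behaves like the iterator output unless something is explicitly fed
x = tf.placeholder_with_default(images_batch, shape=[None, 224, 224, 3], name='x')
y_true = tf.placeholder_with_default(labels_batch, shape=[None, NUM_CLASSES], name='y_true')
# the network is then built on x and y_true: during training nothing is fed and
# the iterator drives the graph; at inference time
#   session.run(accuracy, feed_dict={x: test_images, y_true: test_labels})
# bypasses the iterator entirely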

Recommended Answer

When restoring a saved meta graph, you can restore the initialization operation by name and then use it again to initialize the input pipeline for inference.

That is, when creating the graph, you can do:

    dataset_init_op = iterator.make_initializer(dataset, name='dataset_init')

And then restore this operation by doing:

    dataset_init_op = graph.get_operation_by_name('dataset_init')

Here is a self-contained code snippet that compares the results of a randomly initialized model before and after restoring:

import numpy as np
import tensorflow as tf

np.random.seed(42)
data = np.random.random([4, 4])
X = tf.placeholder(dtype=tf.float32, shape=[4, 4], name='X')
dataset = tf.data.Dataset.from_tensor_slices(X)
iterator = tf.data.Iterator.from_structure(dataset.output_types, dataset.output_shapes)
dataset_next_op = iterator.get_next()

# name the initializer operation so it can be retrieved after restoring
dataset_init_op = iterator.make_initializer(dataset, name='dataset_init')

w = np.random.random([1, 4])
W = tf.Variable(w, name='W', dtype=tf.float32)
output = tf.multiply(W, dataset_next_op, name='output')
sess = tf.Session()
saver = tf.train.Saver()
sess.run(tf.global_variables_initializer())
sess.run(dataset_init_op, feed_dict={X: data})
while True:
    try:
        print(sess.run(output))
    except tf.errors.OutOfRangeError:
        saver.save(sess, 'tmp/', global_step=1002)
        break

And then you can restore the same model for inference as follows:

import os

import numpy as np
import tensorflow as tf

np.random.seed(42)
data = np.random.random([4, 4])
tf.reset_default_graph()
sess = tf.Session()
saver = tf.train.import_meta_graph('tmp/-1002.meta')
ckpt = tf.train.get_checkpoint_state(os.path.dirname('tmp/checkpoint'))
saver.restore(sess, ckpt.model_checkpoint_path)
graph = tf.get_default_graph()

# Restore the init operation by the name it was given above
dataset_init_op = graph.get_operation_by_name('dataset_init')

X = graph.get_tensor_by_name('X:0')
output = graph.get_tensor_by_name('output:0')
sess.run(dataset_init_op, feed_dict={X: data})
while True:
    try:
        print(sess.run(output))
    except tf.errors.OutOfRangeError:
        break
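
Applied to the pipeline from the question, the same pattern means naming the initializer (and making the tfrecord source feedable) when the graph is first built, instead of calling make_initializer on a restored Operation. Roughly (a sketch; the filenames placeholder and the names 'filenames' and 'dataset_init' are illustrative additions, not part of the original code):

# at graph-construction (training) time: a feedable source plus a named init op
filenames = tf.placeholder(tf.string, shape=[None], name='filenames')
dataset = tf.contrib.data.TFRecordDataset(filenames)
dataset = dataset.map(ds._path_records_parser).batch(BATCH_SIZE)
dataset_init_op = training_iterator.make_initializer(dataset, name='dataset_init')

# at inference time, after import_meta_graph and restore as above
dataset_init_op = graph.get_operation_by_name('dataset_init')
filenames = graph.get_tensor_by_name('filenames:0')
accuracy = graph.get_tensor_by_name('accuracy/Mean:0')
sess.run(dataset_init_op, feed_dict={filenames: [testing_record_path]})
while True:
    try:
        print(sess.run(accuracy))  # the iterator feeds the batch; no feed_dict for the inputs
    except tf.errors.OutOfRangeError:
        break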
