Restoring a TensorFlow model
Question
I'm trying to restore a TensorFlow model. I followed this example: http://nasdag.github.io/blog/2016/01/19/classifying-bees-with-google-tensorflow/
At the end of the code in the example, I added these lines:
saver = tf.train.Saver()
save_path = saver.save(sess, "model.ckpt")
print("Model saved in file: %s" % save_path)
Two files were created: checkpoint and model.ckpt.
In a new Python file (tomas_bees_predict.py), I have this code:
import tensorflow as tf
saver = tf.train.Saver()
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "model.ckpt")
    print("Model restored.")
However, when I execute the code, I get this error:
Traceback (most recent call last):
  File "tomas_bees_predict.py", line 3, in <module>
    saver = tf.train.Saver()
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 705, in __init__
    raise ValueError("No variables to save")
ValueError: No variables to save
Is there a way to read the model.ckpt file and see which variables are saved? Or maybe someone can help with saving the model and restoring it, based on the example described above?
Edit 1:
I tried running the same code in order to recreate the model structure, and I got the error. I think it could be related to the fact that the code described here doesn't use named variables: http://nasdag.github.io/blog/2016/01/19/classifying-bees-with-google-tensorflow/
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)
So I did an experiment. I wrote two versions of the code that saves the model (with and without named variables), plus the code that restores it.
tensor_save_named_vars.py:
import tensorflow as tf
# Create some variables.
v1 = tf.Variable(1, name="v1")
v2 = tf.Variable(2, name="v2")
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
    sess.run(init_op)
    print "v1 = ", v1.eval()
    print "v2 = ", v2.eval()
    # Save the variables to disk.
    save_path = saver.save(sess, "/tmp/model.ckpt")
    print "Model saved in file: ", save_path
tensor_save_not_named_vars.py:
import tensorflow as tf
# Create some variables.
v1 = tf.Variable(1)
v2 = tf.Variable(2)
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
    sess.run(init_op)
    print "v1 = ", v1.eval()
    print "v2 = ", v2.eval()
    # Save the variables to disk.
    save_path = saver.save(sess, "/tmp/model.ckpt")
    print "Model saved in file: ", save_path
tensor_restore.py:
import tensorflow as tf
# Create some variables.
v1 = tf.Variable(0, name="v1")
v2 = tf.Variable(0, name="v2")
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "/tmp/model.ckpt")
    print "Model restored."
    print "v1 = ", v1.eval()
    print "v2 = ", v2.eval()
Here is what I get when I execute this code:
$ python tensor_save_named_vars.py
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 4
v1 = 1
v2 = 2
Model saved in file: /tmp/model.ckpt
$ python tensor_restore.py
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 4
Model restored.
v1 = 1
v2 = 2
$ python tensor_save_not_named_vars.py
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 4
v1 = 1
v2 = 2
Model saved in file: /tmp/model.ckpt
$ python tensor_restore.py
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 4
W tensorflow/core/common_runtime/executor.cc:1076] 0x7ff953881e40 Compute status: Not found: Tensor name "v2" not found in checkpoint files /tmp/model.ckpt
[[Node: save/restore_slice_1 = RestoreSlice[dt=DT_INT32, preferred_shard=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/restore_slice_1/tensor_name, save/restore_slice_1/shape_and_slice)]]
W tensorflow/core/common_runtime/executor.cc:1076] 0x7ff953881e40 Compute status: Not found: Tensor name "v1" not found in checkpoint files /tmp/model.ckpt
[[Node: save/restore_slice = RestoreSlice[dt=DT_INT32, preferred_shard=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/restore_slice/tensor_name, save/restore_slice/shape_and_slice)]]
Traceback (most recent call last):
File "tensor_restore.py", line 14, in <module>
saver.restore(sess, "/tmp/model.ckpt")
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 891, in restore
sess.run([self._restore_op_name], {self._filename_tensor_name: save_path})
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 368, in run
results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 444, in _do_run
e.code)
tensorflow.python.framework.errors.NotFoundError: Tensor name "v2" not found in checkpoint files /tmp/model.ckpt
[[Node: save/restore_slice_1 = RestoreSlice[dt=DT_INT32, preferred_shard=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/restore_slice_1/tensor_name, save/restore_slice_1/shape_and_slice)]]
Caused by op u'save/restore_slice_1', defined at:
File "tensor_restore.py", line 8, in <module>
saver = tf.train.Saver()
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 713, in __init__
restore_sequentially=restore_sequentially)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 432, in build
filename_tensor, vars_to_save, restore_sequentially, reshape)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 191, in _AddRestoreOps
values = self.restore_op(filename_tensor, vs, preferred_shard)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 106, in restore_op
preferred_shard=preferred_shard)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/io_ops.py", line 189, in _restore_slice
preferred_shard, name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 271, in _restore_slice
preferred_shard=preferred_shard, name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 664, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1834, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1043, in __init__
self._traceback = _extract_stack()
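The NotFoundError above is consistent with the saver storing each value under its variable's name: a variable created without `name=` receives an auto-generated name (`Variable`, `Variable_1`, ...), so a restore script asking for `v1` and `v2` finds nothing. The following is a plain-Python model of that bookkeeping, not TensorFlow code; `make_namer`, `save`, `restore` and the dict-based checkpoint are hypothetical stand-ins used only for illustration.

```python
def make_namer():
    """Return a function that hands out graph-unique names, modelling how
    a repeated (or default) name gets a _1, _2, ... suffix."""
    counts = {}

    def unique_name(requested=None):
        base = requested or "Variable"  # default when name= is omitted
        seen = counts.get(base, 0)
        counts[base] = seen + 1
        return base if seen == 0 else "%s_%d" % (base, seen)

    return unique_name


def save(names, values):
    """Model a checkpoint as a dict keyed by variable name (assumption)."""
    return dict(zip(names, values))


def restore(checkpoint, names):
    """Look up every requested name; fail if any is absent."""
    missing = [n for n in names if n not in checkpoint]
    if missing:
        raise KeyError("not found in checkpoint: " + ", ".join(missing))
    return [checkpoint[n] for n in names]


# Saving script without name=: values are stored under default names.
namer = make_namer()
unnamed_ckpt = save([namer(), namer()], [1, 2])

# Saving script with name="v1" / name="v2".
namer = make_namer()
named_ckpt = save([namer("v1"), namer("v2")], [1, 2])

print(sorted(unnamed_ckpt))               # ['Variable', 'Variable_1']
print(restore(named_ckpt, ["v1", "v2"]))  # [1, 2]

try:
    restore(unnamed_ckpt, ["v1", "v2"])
except KeyError as err:
    print("restore failed:", err)
```

Under this model, the restore script would also succeed against the unnamed checkpoint if it created its own variables without `name=`, in the same order as the saving script.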
So perhaps the original code (see the external link above) could be modified to something like this:
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    weight_var = tf.Variable(initial, name="weight_var")
    return weight_var

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    bias_var = tf.Variable(initial, name="bias_var")
    return bias_var
But then my question is: is restoring the weight_var and bias_var variables sufficient to run predictions? I did the training on a powerful machine with a GPU, and I would like to copy the model to a less powerful computer without a GPU to run predictions.
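Naming the variables is probably not the whole story. The model.ckpt written here appears to hold variable values only, so the prediction script would still have to rebuild the same graph before calling `saver.restore`. In addition, when the same `name=` is requested several times, TensorFlow makes it unique by appending `_1`, `_2`, ..., so which layer ends up as `weight_var` depends on creation order. A plain-Python sketch of that dependency follows; the layer names `conv1` and `fc1` and the helper `assign_names` are hypothetical, and the code models the naming only, not TensorFlow itself.

```python
def assign_names(creation_order):
    """Map each graph-unique variable name to the layer that owns it.

    creation_order is a list of (layer, requested_name) pairs, in the
    order the variables are created.
    """
    counts = {}
    owners = {}
    for layer, requested in creation_order:
        seen = counts.get(requested, 0)
        counts[requested] = seen + 1
        unique = requested if seen == 0 else "%s_%d" % (requested, seen)
        owners[unique] = layer
    return owners


# Hypothetical two-layer model: each layer creates a weight, then a bias.
layers = [
    ("conv1", "weight_var"), ("conv1", "bias_var"),
    ("fc1", "weight_var"), ("fc1", "bias_var"),
]

training = assign_names(layers)
same_order = assign_names(layers)           # prediction script, same code
swapped = assign_names(layers[2:] + layers[:2])  # layers built in reverse

print(sorted(training))        # ['bias_var', 'bias_var_1', 'weight_var', 'weight_var_1']
print(training == same_order)  # True
print(training["weight_var"], swapped["weight_var"])  # conv1 fc1
```

The practical reading is that the prediction script should reuse the exact model-building code from training, so that every saved name lines up with the variable it was saved from.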