ERROR:tensorflow:Couldn't match files for checkpoint
Question
I am training a TensorFlow model. After each epoch I save the model state and pickle some arrays. So far the model has completed 2 epochs, and the folder with the saved states contains the following files:
checkpoint
model_e_knihy_preprocessed.txt_e0.ckpt-1134759.data-00000-of-00001
model_e_knihy_preprocessed.txt_e0.ckpt-1134759.index
model_e_knihy_preprocessed.txt_e0.ckpt-1134759.meta
model_e_knihy_preprocessed.txt_e1.ckpt-2269536.data-00000-of-00001
model_e_knihy_preprocessed.txt_e1.ckpt-2269536.index
model_e_knihy_preprocessed.txt_e1.ckpt-2269536.meta
topgrads_e_knihy_preprocessed.txt_[it0].pkl
topgrads_e_knihy_preprocessed.txt_[it1].pkl
toppositions_e_knihy_preprocessed.txt_[it0].pkl
toppositions_e_knihy_preprocessed.txt_[it1].pkl
vocab.txt
I did not move the folder or make any external modifications to the file structure. The checkpoint file contains the following:
model_checkpoint_path: "model_e_knihy_preprocessed.txt_e1.ckpt-2269536"
all_model_checkpoint_paths: "model_e_knihy_preprocessed.txt_e0.ckpt-1134759"
all_model_checkpoint_paths: "model_e_knihy_preprocessed.txt_e1.ckpt-2269536"
I restore the model in the following way:
with tf.Session() as session:
    model = Word2Vec(opts, session)
    model.saver.restore(session, tf.train.latest_checkpoint(path_to_model))
However, the error already occurs inside the call to tf.train.latest_checkpoint(path_to_model):
ERROR:tensorflow:Couldn't match files for checkpoint /mnt/minerva1/nlp/projects/deep_learning/word2vec/trainedmodels/tf_w2vopt_[CS]ebooks_topgradients_iterative/model_e_knihy_preprocessed.txt_e1.ckpt-2269536
So I took a look at the method:
def latest_checkpoint(checkpoint_dir, latest_filename=None):
  ckpt = get_checkpoint_state(checkpoint_dir, latest_filename)
  if ckpt and ckpt.model_checkpoint_path:
    # Look for either a V2 path or a V1 path, with priority for V2.
    v2_path = _prefix_to_checkpoint_path(ckpt.model_checkpoint_path,
                                         saver_pb2.SaverDef.V2)
    v1_path = _prefix_to_checkpoint_path(ckpt.model_checkpoint_path,
                                         saver_pb2.SaverDef.V1)
    if file_io.get_matching_files(v2_path) or file_io.get_matching_files(
        v1_path):
      return ckpt.model_checkpoint_path
    else:
      logging.error("Couldn't match files for checkpoint %s",
                    ckpt.model_checkpoint_path)
  return None
and found that file_io.get_matching_files(v2_path) finds nothing, even though v2_path contains the value /mnt/minerva1/nlp/projects/deep_learning/word2vec/trainedmodels/tf_w2vopt_[CS]ebooks_topgradients_iterative/model_e_knihy_preprocessed.txt_e1.ckpt-2269536.index
which is present in the folder! Sadly, I could not trace it much further, because from this method control passes into a TensorFlow wrapper. Is this a TensorFlow bug?
I am using TensorFlow version 1.5.0-rc0.
Recommended answer
So, the answer is DO NOT USE SQUARE BRACKETS IN YOUR FILE PATH. TensorFlow can't handle them: the path is passed to file_io.get_matching_files, which treats it as a glob pattern, so a directory name containing [CS] is most likely read as a character class rather than as literal text. See https://github.com/tensorflow/tensorflow/issues/6082#issuecomment-265055615.
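To see why a path with square brackets can fail to match itself, here is a minimal sketch using Python's standard fnmatch and glob modules. This is only an analogy: TensorFlow does its matching in its own filesystem layer, not with these modules, and the shortened path and the has_glob_chars helper below are hypothetical names made up for illustration.

```python
import fnmatch
import glob

# Hypothetical, shortened stand-in for the checkpoint path in the question.
path = "trainedmodels/tf_w2vopt_[CS]ebooks/model_e1.ckpt-2269536.index"

# Read as a glob pattern, "[CS]" is a character class: it matches a single
# "C" or "S", not the literal text "[CS]", so the path fails to match itself.
matches_itself = fnmatch.fnmatchcase(path, path)

# The same pattern does match a path with one "C" in place of "[CS]".
matches_c = fnmatch.fnmatchcase(path.replace("[CS]", "C"), path)

# glob.escape wraps the special characters so they are matched literally.
matches_escaped = fnmatch.fnmatchcase(path, glob.escape(path))


def has_glob_chars(p):
    """Return True if p contains characters that glob treats specially."""
    return any(ch in p for ch in "*?[")


# prints: False True True True
print(matches_itself, matches_c, matches_escaped, has_glob_chars(path))
```

A check like has_glob_chars could be run on the checkpoint directory before training starts. Whether escaping the pattern also works with TensorFlow's own matcher is not something this sketch establishes, so renaming the directory remains the safe fix.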