Tensorflow: "GraphDef cannot be larger than 2GB." error when saving model after assigning variables

Problem description

I want to use a pretrained model to warm-start another model that differs slightly from it. To do so, I create the new model and assign the pretrained weights to the variables that share the same names. However, an error occurs when I save the model.

Traceback (most recent call last):
  File "tf_test.py", line 23, in <module>
    save_path = saver.save(sess, "./model.ckpt")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1308, in save
    self.export_meta_graph(meta_graph_filename)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1331, in export_meta_graph
    graph_def=ops.get_default_graph().as_graph_def(add_shapes=True),
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2268, in as_graph_def
    result, _ = self._as_graph_def(from_version, add_shapes)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2231, in _as_graph_def
    raise ValueError("GraphDef cannot be larger than 2GB.")
ValueError: GraphDef cannot be larger than 2GB.

The example code is as follows:

import tensorflow as tf
import numpy as np

v1 = tf.get_variable("L_enc", [400000, 1024])
v2 = tf.get_variable("L_dec", [400000, 1024])

init_op = tf.initialize_all_variables()

saver = tf.train.Saver(tf.all_variables())

with tf.Session() as sess:
  sess.run(init_op)
  for v in tf.trainable_variables():
    embedding = np.random.uniform(-1, 1, (400000, 1024))
    sess.run(v.assign(embedding))
  # Save the variables to disk.
  save_path = saver.save(sess, "./model.ckpt")
  print("Model saved in file: %s" % save_path)

Solution

Fabrizio correctly points out that there's a hard 2GB limit on the size of protocol buffers, but you might be wondering why your program hits that limit. The problem stems from these lines:

for v in tf.trainable_variables():
  embedding = np.random.uniform(-1, 1, (400000, 1024))
  sess.run(v.assign(embedding))

When execution reaches v.assign(embedding), new nodes are added to the TensorFlow graph. In particular, each embedding array is converted to a tf.constant() tensor, which is very large: 400000 × 1024 float32 values take about 1.6GB, so the two constants together come to roughly 3.28GB and push the serialized graph past the 2GB limit.
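
As a rough check on those sizes, the arithmetic can be sketched in plain Python. The helper name constant_size_bytes is hypothetical. The sketch assumes each array is stored in the graph as float32 (the dtype of the variable, 4 bytes per element) and treats the limit as 2**31 bytes.

```python
import math

# Assumption: the 2GB protocol buffer limit is taken here as 2**31 bytes.
GRAPHDEF_LIMIT_BYTES = 2 ** 31


def constant_size_bytes(shape, bytes_per_element=4):
    """Bytes needed for a dense tensor of `shape` (float32 by default)."""
    return math.prod(shape) * bytes_per_element


# One constant is embedded in the graph per variable (L_enc and L_dec).
per_constant = constant_size_bytes((400000, 1024))
total = 2 * per_constant
exceeds_limit = total > GRAPHDEF_LIMIT_BYTES

print("per constant: %.2f GB" % (per_constant / 1e9))
print("both constants: %.2f GB" % (total / 1e9))
print("exceeds limit:", exceeds_limit)
```

A single constant stays under the limit, which is consistent with the assignments themselves succeeding and the error only appearing once the whole graph is serialized in saver.save().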

The best way to avoid this is to load the variables from the previous model directly into your new model using a tf.train.Saver. Since the models might have a different structure, you might need to specify a mapping from the names of variables in the old model to the tf.Variable objects in your new model.
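
A minimal sketch of that name mapping follows, assuming the TensorFlow 1.x graph-mode API used elsewhere in this question. The helpers build_restore_map and restore_pretrained, and the checkpoint name "encoder/embedding", are hypothetical and only illustrate the idea; tf.train.Saver accepts a dict from checkpoint names to variables as its var_list.

```python
def build_restore_map(new_variables, old_name_for):
    """Map old-model (checkpoint) names to new-model variables.

    new_variables: dict of new-model variable name -> variable object
    old_name_for:  dict of new-model name -> name used in the old checkpoint;
                   names missing from it are assumed to be unchanged.
    """
    return {old_name_for.get(name, name): var
            for name, var in new_variables.items()}


def restore_pretrained(sess, checkpoint_path, restore_map):
    """Load checkpoint values straight into the new model's variables."""
    # Imported lazily so the mapping helper works without TensorFlow installed.
    import tensorflow as tf  # assumption: TensorFlow 1.x

    # With a dict var_list, each key is the name looked up in the checkpoint
    # and each value is the variable that receives the stored tensor.
    saver = tf.train.Saver(var_list=restore_map)
    saver.restore(sess, checkpoint_path)


# Stand-in strings are used in place of tf.Variable objects here.
example_map = build_restore_map(
    {"L_enc": "v1", "L_dec": "v2"},
    {"L_enc": "encoder/embedding"},  # hypothetical old-model name
)
print(example_map)
```

Because the values are read from the checkpoint at restore time, no large constants are added to the graph.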


An alternative way to solve your problem would be to pre-create a tf.placeholder() op for assigning a value to each variable. This might require more restructuring of your actual code, but the following worked for me:

import numpy as np
import tensorflow as tf

v1 = tf.get_variable("L_enc", [400000, 1024])
v2 = tf.get_variable("L_dec", [400000, 1024])

# Define a separate placeholder and assign op for each variable, so
# that we can feed the initial value without adding it to the graph.
variables = [v1, v2]
placeholders = [tf.placeholder(tf.float32, shape=[400000, 1024]) for _ in variables]
assign_ops = [v.assign(p) for (v, p) in zip(variables, placeholders)]

init_op = tf.global_variables_initializer()

saver = tf.train.Saver(tf.global_variables())

with tf.Session() as sess:
  sess.run(init_op)
  for p, assign_op in zip(placeholders, assign_ops):
    # The array is passed through the feed dict, so it never becomes a graph node.
    embedding = np.random.uniform(-1, 1, (400000, 1024))
    sess.run(assign_op, {p: embedding})

  # Save the variables to disk.
  save_path = saver.save(sess, "./model.ckpt")
  print("Model saved in file: %s" % save_path)
