Tensorflow Error: "Cannot parse tensor from proto"


Question

I am creating a deep CNN with TensorFlow. I have already created the architecture, and now I am training the model. When I begin training, I use the command:

sess.run(tf.global_variables_initializer())

When this command is called, I get the error shown below. My intuition tells me that the tensor shape may be too large to parse/initialize. I have researched this error and there seems to be little documentation online. Does this error give enough information to tell what the problem is? Thank you.

2017-10-25 15:07:54.252194: W C:\tf_jenkins\home\workspace\rel-
win\M\windows\PY\35\tensorflow\core\framework\op_kernel.cc:1182] Invalid 
argument: Cannot parse tensor from proto: dtype: DT_FLOAT
tensor_shape {
  dim {
    size: 16
  }
  dim {
    size: 16
  }
  dim {
    size: 7
  }
  dim {
    size: 3298
  }
  dim {
    size: 3298
  }
}
float_val: 0

2017-10-25 15:07:54.252767: E C:\tf_jenkins\home\workspace\rel-
win\M\windows\PY\35\tensorflow\core\common_runtime\executor.cc:644] Executor 
failed to create kernel. Invalid argument: Cannot parse tensor from proto: 
dtype: DT_FLOAT
tensor_shape {
  dim {
    size: 16
  }
  dim {
    size: 16
  }
  dim {
    size: 7
  }
  dim {
    size: 3298
  }
  dim {
    size: 3298
  }
}
float_val: 0

         [[Node: Variable_737/Adam_1/Initializer/zeros = Const[_class=
["loc:@Variable_737"], dtype=DT_FLOAT, value=<Invalid TensorProto: dtype: 
DT_FLOAT tensor_shape { dim { size: 16 } dim { size: 16 } dim { size: 7 } 
dim { size: 3298 } dim { size: 3298 } } float_val: 0>, 
_device="/job:localhost/replica:0/task:0/cpu:0"]()]]
2017-10-25 15:07:54.320979: W C:\tf_jenkins\home\workspace\rel-
win\M\windows\PY\35\tensorflow\core\framework\op_kernel.cc:1182] Invalid 
argument: Cannot parse tensor from proto: dtype: DT_FLOAT
tensor_shape {
  dim {
    size: 16
  }
  dim {
    size: 16
  }
  dim {
    size: 7
  }
  dim {
    size: 3298
  }
  dim {
    size: 3298
  }
}
float_val: 0


Answer

As @Tarun Wadhwa said, TensorFlow doesn't allow tensors larger than 2 GB on a single device. Your tensor is of size (roughly 19 x 10^9 entries) x 4 bytes ≈ 78 GB if you're using dtype='tf.float32'.
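
For concreteness, here is the arithmetic, taking the shape straight from the TensorProto in your traceback (a quick back-of-the-envelope sketch):

import numpy as np

# Shape taken from the TensorProto in the error log above.
shape = (16, 16, 7, 3298, 3298)
entries = np.prod(shape, dtype=np.int64)   # ~1.95e10 elements
size_bytes = entries * 4                   # float32 = 4 bytes per element
print(entries, size_bytes / 1e9)           # roughly 78 GB, far beyond the 2 GB proto limit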

Firstly, you can try using tf.float16. This would halve the size of your tensor in RAM. (It will also add some noise to the weights, which provides a regularizing effect, and that is a good thing.) You can also try increasing the stride parameter in your convolutional layers.
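
Here is a minimal TF 1.x-style sketch of both ideas; the shapes and variable names are illustrative, not taken from your model:

import tensorflow as tf

# Half-precision weights plus a larger stride (illustrative shapes/names only).
x = tf.placeholder(tf.float16, shape=[None, 64, 64, 3])
w = tf.get_variable(
    "conv1_w", shape=[5, 5, 3, 16], dtype=tf.float16,
    initializer=tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float16))
# A stride of 2 instead of 1 halves each spatial dimension of the output,
# shrinking every downstream activation and weight tensor.
y = tf.nn.conv2d(x, w, strides=[1, 2, 2, 1], padding="SAME")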

But even then you still won't fit within the 2 GB limit. In that case, you should distribute your computational graph across multiple GPUs and train the model there. You'll have to restructure your code using with tf.device statements, which is a whole new ballgame. AWS provides 8- and 16-GPU p2 instances on EC2.
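
A bare-bones sketch of what the with tf.device restructuring looks like (the device strings assume a machine with at least two visible GPUs; allow_soft_placement falls back to whatever device is available):

import tensorflow as tf

# Place different parts of the graph on different devices.
with tf.device("/gpu:0"):
    w1 = tf.get_variable("w1", shape=[1024, 1024])
    h = tf.matmul(tf.random_normal([32, 1024]), w1)

with tf.device("/gpu:1"):
    w2 = tf.get_variable("w2", shape=[1024, 10])
    logits = tf.matmul(h, w2)

with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(logits).shape)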

Why do you need to work with such humongous tensors?

