Why dataset.output_shapes returns Dimension(None) after batching


Problem description

I'm using the Dataset API for input pipelines in TensorFlow (version: r1.2). I built my dataset and batched it with a batch size of 128. The dataset is fed into an RNN.

Unfortunately, dataset.output_shapes returns Dimension(None) in the first dimension, so the RNN raises an error:

Traceback (most recent call last):
  File "untitled1.py", line 188, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/harold/anaconda2/envs/tensorflow_py2.7/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "untitled1.py", line 121, in main
    run_training()
  File "untitled1.py", line 57, in run_training
    is_training=True)
  File "/home/harold/huawei/ConvLSTM/ConvLSTM.py", line 216, in inference
    initial_state=initial_state)
  File "/home/harold/anaconda2/envs/tensorflow_py2.7/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 566, in dynamic_rnn
    dtype=dtype)
  File "/home/harold/anaconda2/envs/tensorflow_py2.7/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 636, in _dynamic_rnn_loop
    "Input size (depth of inputs) must be accessible via shape inference,"
ValueError: Input size (depth of inputs) must be accessible via shape inference, but saw value None.

I think this error is caused by the shape of the input: the first dimension should be the batch size, not None.

Here is the code:

origin_dataset = Dataset.BetweenS_Dataset(FLAGS.data_path)
train_dataset = origin_dataset.train_dataset
test_dataset = origin_dataset.test_dataset
shuffle_train_dataset = train_dataset.shuffle(buffer_size=10000)
shuffle_batch_train_dataset = shuffle_train_dataset.batch(128)
batch_test_dataset = test_dataset.batch(FLAGS.batch_size)

iterator = tf.contrib.data.Iterator.from_structure(
    shuffle_batch_train_dataset.output_types,
    shuffle_batch_train_dataset.output_shapes)
(images, labels) = iterator.get_next()

training_init_op = iterator.make_initializer(shuffle_batch_train_dataset)
test_init_op = iterator.make_initializer(batch_test_dataset)

print(shuffle_batch_train_dataset.output_shapes)

Printing output_shapes gives:

(TensorShape([Dimension(None), Dimension(36), Dimension(100)]), TensorShape([Dimension(None)]))

I expected the first dimension to be 128, because I batched the dataset:

(TensorShape([Dimension(128), Dimension(36), Dimension(100)]), TensorShape([Dimension(128)]))

Solution

The batch dimension is hardcoded to None in the implementation, so output_shapes will always report None for it (tf 1.3).

def _padded_shape_to_batch_shape(s):
  return tensor_shape.vector(None).concatenate(
      tensor_util.constant_value_as_shape(s))

This way, batch() can emit every element, including a smaller final batch (e.g. dataset_size=14, batch_size=5, last_batch_size=4).
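The arithmetic behind that example can be sketched in plain Python. This is only an illustration of the batch sizes, not TensorFlow code; batch_sizes is a hypothetical helper name introduced here.

```python
def batch_sizes(dataset_size, batch_size):
    """Return the size of each batch when the final partial batch is kept."""
    full, remainder = divmod(dataset_size, batch_size)
    sizes = [batch_size] * full
    if remainder:
        # The leftover elements form one last, smaller batch.
        sizes.append(remainder)
    return sizes


sizes = batch_sizes(14, 5)
print(sizes)  # [5, 5, 4]
```

Because the last batch may be smaller than the others, no single static value fits the first dimension, which is why it is reported as None.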

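You can use dataset.filter and dataset.map to work around this:

import tensorflow as tf  # TF 1.x, where the Dataset API lives in tf.contrib.data

batch_size = 5
d = tf.contrib.data.Dataset.from_tensor_slices([[5] for x in range(14)])
d = d.batch(batch_size)
# Drop the final partial batch so every remaining batch has exactly batch_size rows.
d = d.filter(lambda e: tf.equal(tf.shape(e)[0], batch_size))

def batch_reshape(e):
    # Restore the static batch dimension; unknown trailing dimensions become -1.
    return tf.reshape(e, [batch_size] + [s if s is not None else -1 for s in e.shape[1:].as_list()])

d = d.map(batch_reshape)
r = d.make_one_shot_iterator().get_next()
print('dataset_output_shape = %s' % r.shape)
with tf.Session() as sess:
    while True:  # runs until the iterator raises OutOfRangeError
        print(sess.run(r))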

Output

dataset_output_shape = (5, 1)

[[5][5][5][5][5]]

[[5][5][5][5][5]]

OutOfRangeError
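For readers on newer TensorFlow releases (an assumption that goes beyond the TF 1.2/1.3 versions discussed here), batch() accepts a drop_remainder argument that discards the final partial batch and lets the static batch dimension be inferred. A minimal sketch, assuming TensorFlow 2.x with the tf.data API:

```python
import tensorflow as tf  # assumes TensorFlow 2.x, not the TF 1.2/1.3 from the question

batch_size = 5
d = tf.data.Dataset.from_tensor_slices([[5]] * 14)
# drop_remainder=True discards the last 4 elements, so every batch has 5 rows
# and the batch dimension is known statically.
d = d.batch(batch_size, drop_remainder=True)

static_shape = d.element_spec.shape.as_list()
print(static_shape)  # [5, 1]
```

This has the same effect as the filter and map workaround above: the partial batch is lost in exchange for a fully defined shape.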
