TensorFlow - How to batch the dataset

Question

I am building a convolutional neural network for digit recognition. I want to train it on an image dataset, but I don't know how to "batch" the training data.

I have two arrays, train_image and train_label:

print(train_image.shape)
# (73257, 1024)
# 73257 images, each flattened from 32x32 to 1024 values

print(train_label.shape)
# (73257, 10)
# Digit '1' has label 1, '9' has label 9 and '0' has label 10

Now I want to batch the training data with a batch size of 50:

    sess.run(tf.initialize_all_variables())
    train_image_batch, train_label_batch = tf.train.shuffle_batch(
        [train_image, train_label],
        batch_size=50, capacity=50000, min_after_dequeue=10000)

When I print train_image_batch:

print(train_image_batch)
# Tensor("shuffle_batch:0", shape=(50, 73257, 1024), dtype=uint8)

I expected the shape to be (50, 1024).

Am I doing something wrong here?

Recommended answer

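shuffle_batch expects a single example by default, so it treated each full array as one example and prepended the batch dimension, which is why the shape came out as (50, 73257, 1024). To make it treat the first dimension of each tensor as the example index, pass enqueue_many=True. See the tf.train.shuffle_batch documentation for details.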

train_image_batch, train_label_batch = tf.train.shuffle_batch(
    [train_image, train_label], batch_size = 50, enqueue_many=True, capacity = 50000, min_after_dequeue = 10000)

print(train_image_batch.shape)

Output:
(50, 1024)
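
To make the shape difference concrete, here is a minimal NumPy sketch of the two interpretations. It is not how TensorFlow's queues are implemented. The helper name shuffled_batches, the seed parameter, and the small stand-in arrays are assumptions made for illustration only.

```python
import numpy as np

def shuffled_batches(images, labels, batch_size, seed=0):
    """Yield (image_batch, label_batch) pairs in a shuffled order.

    Hypothetical helper for illustration; it is not part of TensorFlow.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))  # one shuffled pass over row indices
    # Drop the final partial batch so every batch has exactly batch_size rows.
    for start in range(0, len(order) - batch_size + 1, batch_size):
        idx = order[start:start + batch_size]
        yield images[idx], labels[idx]

# Small stand-in arrays with the same layout as in the question
# (rows are examples; the real data has 73257 rows).
num_examples = 200
train_image = np.zeros((num_examples, 1024), dtype=np.uint8)
train_label = np.zeros((num_examples, 10), dtype=np.uint8)

# Treating the whole array as ONE example and stacking 50 of them
# reproduces the unexpected leading dimension from the question.
whole_array_batch = np.stack([train_image] * 50)
print(whole_array_batch.shape)  # (50, 200, 1024)

# Treating each ROW as one example gives the expected batch shape.
image_batch, label_batch = next(shuffled_batches(train_image, train_label, 50))
print(image_batch.shape, label_batch.shape)  # (50, 1024) (50, 10)
```

The first print corresponds to the default single-example behaviour, and the second to what enqueue_many=True asks for: each row of the input is one example. Images and labels are indexed with the same shuffled indices, so each image stays paired with its label.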
