Create a data generator with tf.data.Dataset for sequence models


Problem Description

I have an image dataset of RGB images: img1.png, img2.png ... img250.png. From each image I have extracted 100 patches of size [64, 64, 3], so the dataset now looks like img1_1.png, img1_2.png ... img1_100.png, img2_1.png, img2_2.png ... img2_100.png, img3_1.png, ...

I want to create a data generator with tf.data.Dataset.from_tensor_slices that passes all patches of each image to an RNN model, so the generator should produce output of shape [batch_size, 100, 64, 64, 3].

How can I do that?

Recommended Answer

Code:

# Dummy data: 250 random RGB patches of size 64x64x3
import numpy as np
import tensorflow as tf

x = tf.constant(np.random.randint(256, size=(250, 64, 64, 3)), dtype=tf.int32)

# Group the patches into sequences of length 100
dataset = tf.data.Dataset.from_tensor_slices(x).batch(100, drop_remainder=True)
for i in dataset:
    print(i.shape)

Output:

(100, 64, 64, 3)
(100, 64, 64, 3)

Make sure drop_remainder=True, so that the 50 leftover patches are dropped and every sequence contains exactly 100 patches.
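For comparison, a minimal sketch (reusing the x defined above) of what happens when drop_remainder is left at its default of False: the last group holds only the 50 leftover patches, so the sequence length is no longer uniform and a later .batch() call cannot stack the sequences into a single tensor.

# Without drop_remainder the leftover patches form a short final group
dataset_ragged = tf.data.Dataset.from_tensor_slices(x).batch(100)
for i in dataset_ragged:
    print(i.shape)

# (100, 64, 64, 3)
# (100, 64, 64, 3)
# (50, 64, 64, 3)   <- uneven sequence length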

Finally, batch the sequences into batches of the desired size.

# creating dataset with batch_size
dataset = dataset.batch(32)
for i in dataset:
    print(i.shape)

Output:

(2, 100, 64, 64, 3)

The 250 dummy patches yield only two full sequences of 100, so the dataset here contains a single batch holding just those 2 sequences.

If your data already has shape (250, 100, 64, 64, 3), i.e. 250 images each with their 100 patches stacked, a single batch call is enough:

# Here x is assumed to have shape (250, 100, 64, 64, 3)
dataset = tf.data.Dataset.from_tensor_slices(x).batch(32)
for i in dataset:
    print(i.shape)

Output:

(32, 100, 64, 64, 3)
(32, 100, 64, 64, 3)
(32, 100, 64, 64, 3)
(32, 100, 64, 64, 3)
(32, 100, 64, 64, 3)
(32, 100, 64, 64, 3)
(32, 100, 64, 64, 3)
(26, 100, 64, 64, 3)
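
The question actually starts from patch files on disk rather than an in-memory tensor. Below is a minimal sketch of the same grouping built directly from the files, assuming the patches follow the img{i}_{j}.png naming from the question and sit in a patches/ directory (the path, the ordering of the file list, and the load_patch helper are assumptions, not part of the original answer):

import tensorflow as tf

# Hypothetical, ordered list of patch files: the 100 patches of each image
# must be consecutive (img1_1.png ... img1_100.png, img2_1.png, ...)
patch_files = [f"patches/img{i}_{j}.png" for i in range(1, 251) for j in range(1, 101)]

def load_patch(path):
    # Read and decode one 64x64x3 RGB patch
    img = tf.io.decode_png(tf.io.read_file(path), channels=3)
    img.set_shape([64, 64, 3])
    return img

dataset = (tf.data.Dataset.from_tensor_slices(patch_files)
           .map(load_patch, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(100, drop_remainder=True)   # 100 patches -> one sequence per image
           .batch(32)                         # sequences -> batches for the RNN
           .prefetch(tf.data.AUTOTUNE))

for seq_batch in dataset:
    print(seq_batch.shape)   # (32, 100, 64, 64, 3) for full batches

In practice you would likely also cast the patches to float32 and normalize them inside load_patch before feeding the sequences to the RNN.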
