How do I convert train dataset frames into a 5D tensor while maintaining the dimension of the frame labels?
Problem description
I used image_dataset_from_directory() to create my train (529003 frames), validation (29388 frames) and test (28875 frames) datasets:
from tensorflow.keras.utils import image_dataset_from_directory

# TRAIN_DIR, SIZE and SEED are defined elsewhere in the script.
train_dataset = image_dataset_from_directory(
    directory=TRAIN_DIR,
    labels="inferred",
    label_mode="categorical",
    class_names=["0", "10", "5"],
    batch_size=32,
    image_size=SIZE,
    seed=SEED,
    subset=None,
    interpolation="bilinear",
    follow_links=False,
)
# Shape of the data
(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None),
TensorSpec(shape=(None, 3), dtype=tf.float32, name=None))
The model I am using expects the data as a 5D tensor of shape (32, 5, 224, 224, 3). I am using transfer learning with MobileNet followed by an LSTM for video classification.
I tried using:
train_dataset = train_dataset.batch(5).batch(32)
But the dataset becomes 6D, and the labels gain extra dimensions as well:
(TensorSpec(shape=(None, None, None, 224, 224, 3), dtype=tf.float32, name=None),
TensorSpec(shape=(None, None, None, 3), dtype=tf.float32, name=None))
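The extra axis appears because the dataset returned by image_dataset_from_directory() is already batched (batch_size=32), so batch(5).batch(32) stacks two more axes on top of the existing one. The following is a minimal sketch of that effect and of one possible workaround using unbatch(), run on synthetic data instead of the real frames. It assumes that frames arrive in temporal order (the loader shuffles by default, so shuffle=False would be needed) and that all five frames of a clip share a label; neither is guaranteed by the original setup, and clips may straddle video boundaries.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for the loader output: 64 small frames with
# one-hot labels, already batched by 32 (assumption for illustration).
frames = np.zeros((64, 8, 8, 3), dtype=np.float32)
labels = np.tile(np.array([[1.0, 0.0, 0.0]], dtype=np.float32), (64, 1))
batched = tf.data.Dataset.from_tensor_slices((frames, labels)).batch(32)

# Batching an already batched dataset adds two more axes: rank 6.
six_d = batched.batch(5).batch(32)
print(six_d.element_spec[0].shape)

SEQ_LEN = 5
clips = (
    batched.unbatch()                     # back to single frames
    .batch(SEQ_LEN, drop_remainder=True)  # frames (5, H, W, 3), labels (5, 3)
    .map(lambda x, y: (x, y[0]))          # keep the first frame's label per clip
    .batch(32)                            # frames (None, 5, H, W, 3), labels (None, 3)
)
print(clips.element_spec)
```

Taking the first frame's label is only one choice; it is valid only when every frame in the clip belongs to the same class.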
Recommended answer
I found the solution: I need to write a custom generator that produces 5D tensors from the video input, with the sequence length as the additional dimension (the second axis in (32, 5, 224, 224, 3)). The Keras function I was using, image_dataset_from_directory(), produces a 4D tensor.
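The answer does not include the generator itself, so the following is only a sketch of what such a generator might look like, not the author's code. The function name clip_batches and its inputs (a list of per-video frame arrays plus one one-hot label per video) are hypothetical; it cuts each video into non-overlapping 5-frame clips and yields batches shaped (batch, seq_len, H, W, C) with one label per clip.

```python
import numpy as np

def clip_batches(videos, labels, seq_len=5, batch_size=32):
    """Yield (clips, clip_labels) batches from per-video frame arrays.

    videos: list of arrays shaped (num_frames, H, W, C), frames in temporal order
    labels: list of one-hot vectors, one per video
    """
    clip_buf, label_buf = [], []
    for video, label in zip(videos, labels):
        # Non-overlapping clips; trailing frames that do not fill a clip are dropped.
        for start in range(0, len(video) - seq_len + 1, seq_len):
            clip_buf.append(video[start:start + seq_len])
            label_buf.append(label)
            if len(clip_buf) == batch_size:
                yield np.stack(clip_buf), np.stack(label_buf)
                clip_buf, label_buf = [], []
    if clip_buf:  # final partial batch
        yield np.stack(clip_buf), np.stack(label_buf)

# Two dummy videos of 12 and 7 frames give 2 + 1 = 3 clips.
videos = [np.zeros((n, 224, 224, 3), dtype=np.float32) for n in (12, 7)]
labels = [np.array([1, 0, 0], dtype=np.float32),
          np.array([0, 0, 1], dtype=np.float32)]
x, y = next(clip_batches(videos, labels))
print(x.shape, y.shape)  # (3, 5, 224, 224, 3) (3, 3)
```

A generator like this could then be wrapped with tf.data.Dataset.from_generator, passing an output_signature of two tf.TensorSpec entries with shapes (None, 5, 224, 224, 3) and (None, 3), so that it can be fed to model.fit like the original dataset.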