How do I convert training dataset frames into a 5D tensor while keeping the label dimensions unchanged?


Problem description

I used image_dataset_from_directory() to create my train (529003 frames), validation (29388 frames) and test (28875 frames) datasets:


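# image_dataset_from_directory is also exposed under tf.keras.utils in newer releases
from tensorflow.keras.preprocessing import image_dataset_from_directory

train_dataset = image_dataset_from_directory(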
    directory=TRAIN_DIR,
    labels="inferred",
    label_mode="categorical",
    class_names=["0", "10", "5"],
    batch_size=32,
    image_size=SIZE,
    seed=SEED,
    subset=None,
    interpolation="bilinear",
    follow_links=False,
)

# Shape of the data (train_dataset.element_spec)
(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None),
 TensorSpec(shape=(None, 3), dtype=tf.float32, name=None))

The model I am using expects the data as a 5D tensor of shape (32, 5, 224, 224, 3). I am using transfer learning with MobileNet followed by an LSTM for video classification.

I tried using:

train_dataset = train_dataset.batch(5).batch(32)

But the dataset becomes 6D, and the labels gain extra dimensions as well:

(TensorSpec(shape=(None, None, None, 224, 224, 3), dtype=tf.float32, name=None),
 TensorSpec(shape=(None, None, None, 3), dtype=tf.float32, name=None))

Recommended answer

I found the solution: I need to write a custom generator that produces 5D tensors from the video input and treats the sequence length as the additional (fifth) dimension of the tensor. The function I was using from Keras, image_dataset_from_directory(), produces a 4D tensor.
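The answer does not include its generator, so the following is only a minimal sketch of one possible approach, not the author's code. Since image_dataset_from_directory() already batches its output (batch_size=32), each further .batch() call adds another leading axis, which is why the attempt in the question ends up 6D. The sketch instead groups frames into clips first and batches once. The names sequence_generator and as_tf_dataset, and the assumption that frames are available as an ordered list per video with a single one-hot label per video, are hypothetical and not part of the original question.

```python
from typing import Any, Iterable, Iterator, List, Sequence, Tuple


def sequence_generator(
    videos: Iterable[Tuple[Sequence[Any], Any]],
    seq_len: int = 5,
) -> Iterator[Tuple[List[Any], Any]]:
    """Yield (clip, label) pairs, one label per clip of `seq_len` frames.

    `videos` is a hypothetical iterable of (frames, label) pairs, where
    `frames` holds the ordered frames of a single video.
    """
    for frames, label in videos:
        # Non-overlapping windows; trailing frames that do not fill a
        # whole clip are dropped so every clip has the same length.
        for start in range(0, len(frames) - seq_len + 1, seq_len):
            yield list(frames[start:start + seq_len]), label


def as_tf_dataset(videos, seq_len=5, image_shape=(224, 224, 3),
                  num_classes=3, batch_size=32):
    """Wrap the generator in a batched tf.data.Dataset (requires TensorFlow).

    Assumes each frame is a float array of shape `image_shape` and each
    label is a one-hot vector of length `num_classes`.
    """
    import tensorflow as tf  # imported lazily so the generator works without it

    signature = (
        tf.TensorSpec(shape=(seq_len, *image_shape), dtype=tf.float32),
        tf.TensorSpec(shape=(num_classes,), dtype=tf.float32),
    )
    dataset = tf.data.Dataset.from_generator(
        lambda: sequence_generator(videos, seq_len),
        output_signature=signature,
    )
    # Images: (batch, seq_len, 224, 224, 3); labels: (batch, num_classes).
    return dataset.batch(batch_size)
```

Under these assumptions, `videos` should be a re-iterable collection such as a list, so that the dataset can be traversed for more than one epoch. Because the clip is formed before batching, the labels keep a 2D shape of (batch, num_classes) while the images become 5D.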
