Loading a video dataset (Keras)

Question

I'm trying to implement an LRCN / C(LSTM)RNN to classify emotions in videos. My dataset is split into two folders, "train_set" and "valid_set". Each of them contains 3 folders: "positive", "negative" and "surprise". Each of these 3 folders in turn contains video folders, each of which is a collection of the frames of one video as .jpg files. The videos have different lengths, so one video folder can have 200 frames and the ones next to it 1200, 700, ... To load the dataset I am using flow_from_directory. Here I need a few clarifications:

  1. In my case, will flow_from_directory load the videos one by one, sequentially? And their frames?
  2. If I load in batches, does flow_from_directory build a batch based on the sequential order of the images in a video?
  3. If I have a video_1 folder of 5 images and a video_2 folder of 3 images, and a batch size of 7, will flow_from_directory end up selecting two batches of 5 and 3 images, or will it overlap the videos, taking all 5 images from the first folder + 2 from the second? Will it mix my videos?
  4. Is the dataset loading thread-safe? Does worker 1 fetch video frames sequentially from folder 1, worker 2 from folder 2, etc., or can each worker take frames from anywhere and any folder, which would spoil my sequential reading?
  5. If I enable shuffle, will it shuffle the order in which it reads the video folders, or will it start fetching frames in random order from random folders?
  6. What does the TimeDistributed layer do? I cannot really picture it from the documentation. What if I apply it to a CNN's dense layer, or to each layer of a CNN?

Recommended answer

  1. flow_from_directory is made for images, not movies. It will not understand your directory structure and will not create a "frames" dimension. You need your own generator (it is usually better to implement a keras.utils.Sequence).

  2. You can only load in batches if:

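  • you load the movies one by one, because they have different lengths
  • you pad the videos with blank frames so that they all have the same length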
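
The padding option can be sketched as follows. This is a minimal illustration, assuming each video has already been loaded as a NumPy array of shape (frames, height, width, channels). The helper name `pad_videos`, the use of all-zero frames as "blank" frames, and the returned mask are assumptions made for the example, not part of Keras.

```python
import numpy as np

def pad_videos(videos, max_frames=None):
    """Stack variable-length videos into one array by appending blank (zero) frames.

    videos: list of arrays shaped (frames, height, width, channels),
            all sharing the same height, width and channels.
    Returns (batch, mask): batch has shape (n, max_frames, h, w, c);
    mask is True where a frame is real and False where it is padding.
    """
    if max_frames is None:
        max_frames = max(len(v) for v in videos)
    frame_shape = videos[0].shape[1:]
    batch = np.zeros((len(videos), max_frames) + frame_shape, dtype=videos[0].dtype)
    mask = np.zeros((len(videos), max_frames), dtype=bool)
    for i, v in enumerate(videos):
        n = min(len(v), max_frames)  # videos longer than max_frames are truncated
        batch[i, :n] = v[:n]
        mask[i, :n] = True
    return batch, mask

video_1 = np.ones((5, 4, 4, 3), dtype=np.float32)  # 5 frames
video_2 = np.ones((3, 4, 4, 3), dtype=np.float32)  # 3 frames
batch, mask = pad_videos([video_1, video_2])
print(batch.shape)       # (2, 5, 4, 4, 3)
print(mask.sum(axis=1))  # [5 3]
```

The mask records which frames are real, so later code can tell padding apart from actual video content.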

  3. Same as 1.

  4. If you write your own generator implementing keras.utils.Sequence, safety is preserved as long as your implementation keeps track of what each movie is.
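
One way to "know what each movie is" is to index the directory tree once, up front. The sketch below assumes the layout described in the question (set folder / class folder / video folder / frame .jpg files). The names `natural_key` and `index_videos`, and the numeric ordering of file names, are assumptions made for illustration. A keras.utils.Sequence subclass could build such an index in `__init__` and load the frames of one entry in `__getitem__`, so that each worker always receives a whole video rather than loose frames.

```python
import os
import re
import tempfile

def natural_key(name):
    """Sort key that orders 'frame_2.jpg' before 'frame_10.jpg'."""
    return [int(part) if part.isdigit() else part for part in re.split(r"(\d+)", name)]

def index_videos(root):
    """Return (class_names, videos) for a root/class/video/frame.jpg layout.

    videos is a list of (frame_paths, class_index), one entry per video folder,
    with frame_paths sorted into playback order.
    """
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    videos = []
    for class_index, class_name in enumerate(class_names):
        class_dir = os.path.join(root, class_name)
        for video_name in sorted(os.listdir(class_dir), key=natural_key):
            video_dir = os.path.join(class_dir, video_name)
            if not os.path.isdir(video_dir):
                continue
            frames = sorted(
                (f for f in os.listdir(video_dir) if f.lower().endswith(".jpg")),
                key=natural_key,
            )
            if frames:
                paths = [os.path.join(video_dir, f) for f in frames]
                videos.append((paths, class_index))
    return class_names, videos

# Tiny demonstration on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as root:
    for name in ("frame_10.jpg", "frame_2.jpg", "frame_1.jpg"):
        path = os.path.join(root, "positive", "video_1", name)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "wb").close()
    class_names, videos = index_videos(root)
    frame_names = [os.path.basename(p) for p in videos[0][0]]

print(class_names)  # ['positive']
print(frame_names)  # ['frame_1.jpg', 'frame_2.jpg', 'frame_10.jpg']
```

Because every index entry is a complete, ordered video, shuffling the entries changes only the order of the videos and never the order of frames inside one.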

  5. It would shuffle individual images, since images are what it loads; it has no notion of video folders.

  6. TimeDistributed allows data with an extra dimension at index 1. Example: a layer that usually takes (batch_size, ...other dims...) will take (batch_size, extra_dim, ...other dims...). This extra dimension can mean anything, not necessarily time, and it remains untouched.

  • Recurrent layers don't need this (unless you really want an extra dimension there for unusual reasons); they already treat index 1 as time.
  • CNNs will work exactly the same way, image by image, but you can organize your data in the format (batch_size, video_frames, height, width, channels).
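
To make the "extra dimension" concrete, here is a NumPy-only illustration of the idea. It is not how Keras implements the layer: it simply folds the frames dimension into the batch dimension, applies a per-image function, and unfolds the result again. The names `time_distributed` and `per_image_mean` are hypothetical, and `per_image_mean` is only a stand-in for a CNN.

```python
import numpy as np

def time_distributed(fn, x):
    """Apply fn, which expects (batch, ...), to every step of x shaped (batch, steps, ...)."""
    batch, steps = x.shape[:2]
    folded = x.reshape((batch * steps,) + x.shape[2:])   # (batch*steps, ...)
    out = fn(folded)                                     # (batch*steps, ...out dims...)
    return out.reshape((batch, steps) + out.shape[1:])   # (batch, steps, ...out dims...)

def per_image_mean(images):
    """Stand-in for a CNN: maps (n, height, width, channels) to (n, channels)."""
    return images.mean(axis=(1, 2))

# (batch_size, video_frames, height, width, channels)
videos = np.random.rand(2, 5, 8, 8, 3)
features = time_distributed(per_image_mean, videos)
print(features.shape)  # (2, 5, 3)
```

The (2, 5, 3) result has the (batch_size, steps, features) layout that a recurrent layer takes as input, with index 1 playing the role of time.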
