"samples_per_epoch"和"samples_per_epoch"之间有什么区别和"steps_per_epoch"在fit_generator中 [英] What's the difference between "samples_per_epoch" and "steps_per_epoch" in fit_generator


Problem description

I was confused by this problem for several days...

My question is: why is there such a massive difference in training time between setting my generator's batch_size to 1 and setting it to 20?

If I set batch_size to 1, one epoch takes approximately 180-200 sec. If I set batch_size to 20, one epoch takes approximately 3000-3200 sec.

However, this huge difference in training time seems abnormal, since I would expect the reverse: batch_size = 1 -> 3000-3200 sec, and batch_size = 20 -> 180-200 sec.

The input to my generator is not file paths but numpy arrays that are already loaded into memory via np.load(), so I don't think I/O overhead is the issue.

I'm using Keras 2.0.3 and my backend is tensorflow-gpu 1.0.1.

I have seen the update in this merged PR, but it seems that the change doesn't affect anything (the usage is the same as before).

Linked here is a gist of my self-defined generator and the relevant part of my fit_generator call.
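
The gist itself is not reproduced here, but a self-defined generator of this kind typically looks something like the following minimal sketch. The array names and the shuffling logic are illustrative assumptions, not the asker's actual code:

    import numpy as np

    def batch_generator(x, y, batch_size):
        # Illustrative batch generator over in-memory numpy arrays;
        # the names x, y and the shuffling are assumptions.
        n = len(x)
        while True:  # a Keras generator must yield indefinitely
            idx = np.random.permutation(n)
            for start in range(0, n, batch_size):
                batch = idx[start:start + batch_size]
                yield x[batch], y[batch]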

Answer

When you use fit_generator, the number of samples processed in each epoch is batch_size * steps_per_epoch. From the Keras documentation for fit_generator: https://keras.io/models/sequential/

steps_per_epoch: Total number of steps (batches of samples) to yield from generator before declaring one epoch finished and starting the next epoch. It should typically be equal to the number of unique samples of your dataset divided by the batch size.
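
Concretely, if steps_per_epoch is held fixed while batch_size grows, each epoch simply processes more samples. Assuming the asker kept steps_per_epoch unchanged when switching from batch_size = 1 to batch_size = 20, each epoch would process 20 times as many samples, which lines up with the observed jump from roughly 180 sec to roughly 3000 sec per epoch. The fix is to derive steps_per_epoch from the dataset size; the numbers below are illustrative:

    n_samples = 10000          # illustrative dataset size
    batch_size = 20
    # One epoch should cover the dataset exactly once:
    steps_per_epoch = n_samples // batch_size   # 500 steps, not 10000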

This is different from the behaviour of 'fit', where increasing batch_size typically speeds things up.

In conclusion, when you increase batch_size with fit_generator, you should decrease steps_per_epoch by the same factor if you want the training time to stay the same or go down.
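
Putting it together, a call along the following lines keeps the per-epoch workload constant regardless of batch size. This is a sketch assuming the batch_generator above and a compiled Keras model named model; the names are illustrative:

    model.fit_generator(
        batch_generator(x_train, y_train, batch_size),
        steps_per_epoch=len(x_train) // batch_size,
        epochs=10,
    )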
