What if steps_per_epoch does not fit into the number of samples?

Question

When using Keras fit_generator, steps_per_epoch should equal the total number of available samples divided by the batch_size.

But how would the generator or fit_generator react if I choose a batch_size that does not divide the number of samples evenly? Does it yield samples until it can no longer fill a whole batch, or does it just use a smaller batch for the last yield?

Why I ask: I split my data into train/validation/test sets of different sizes (different percentages), but I would use the same batch size for the train and validation sets, and especially for the train and test sets. Because the sets differ in size, I cannot guarantee that the batch size divides the total number of samples evenly.

Recommended answer

If it's your own generator with yield

You are the one who creates the generator, so its behavior is defined by you.

If steps_per_epoch is greater than the number of batches you expected, fit will not notice anything; it will simply keep requesting batches until it reaches the number of steps.

The only requirement is that your generator must be infinite.

You can ensure this with a while True: loop at the beginning of the generator, for instance.
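As an illustration of the points above, here is a minimal sketch of an infinite generator whose last batch in each pass is simply smaller. The names batch_generator and samples are hypothetical, and itertools.islice only stands in for fit requesting steps_per_epoch batches; it is not how Keras is implemented.

```python
import itertools

def batch_generator(samples, batch_size):
    """Yield batches forever; the last batch of each pass may be smaller."""
    while True:  # keeps the generator infinite, so it never runs dry
        for start in range(0, len(samples), batch_size):
            # Slicing past the end just returns the remaining samples
            yield samples[start:start + batch_size]

samples = list(range(10))  # 10 hypothetical samples
gen = batch_generator(samples, batch_size=4)

# Stand-in for one epoch with steps_per_epoch=3: request exactly 3 batches
epoch_batches = list(itertools.islice(gen, 3))
print([len(b) for b in epoch_batches])  # [4, 4, 2]
```

If you would rather drop the partial batch, you could stop the inner loop at the last full batch instead; either choice is yours, as long as steps_per_epoch matches the number of batches you want per epoch.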

If the generator comes from an ImageDataGenerator (for example via flow or flow_from_directory), it is actually a keras.utils.Sequence and it has a length, measured in batches: len(generatorInstance).
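To make the idea of a length in batches concrete, here is a small stand-in class that mimics the Sequence interface (__len__ and __getitem__). ToySequence is hypothetical and is not Keras code; it assumes the batch count is rounded up so that a partial last batch is included, which is exactly the behavior worth verifying on the real generator.

```python
import math

class ToySequence:
    """Hypothetical stand-in for a Sequence-like object (not Keras code)."""

    def __init__(self, samples, batch_size):
        self.samples = samples
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch, rounded up so a partial batch counts
        return math.ceil(len(self.samples) / self.batch_size)

    def __getitem__(self, index):
        # Return batch number `index`; the last one may be shorter
        start = index * self.batch_size
        return self.samples[start:start + self.batch_size]

seq = ToySequence(list(range(10)), batch_size=4)
print(len(seq))                 # 3
print(len(seq[len(seq) - 1]))   # 2
```

With 10 samples and a batch size of 4, this toy object reports 3 batches and its last batch holds the 2 remaining samples.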

Then you can check for yourself what happens:

remainingSamples = total_samples % batch_size  # confirm that this is greater than 0
wholeBatches = total_samples // batch_size     # number of full batches
totalBatches = wholeBatches + 1                # full batches plus one partial batch

if len(generator) == wholeBatches:
    print("missing the last batch")
elif len(generator) == totalBatches:
    print("last batch included")
else:
    print('weird behavior')

And check the size of the last batch:

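lastBatch = generator[len(generator) - 1]
# When labels are present the batch is a tuple (x, y); inspect the inputs
lastX = lastBatch[0] if isinstance(lastBatch, tuple) else lastBatch

if lastX.shape[0] == remainingSamples:
    print('last batch contains the remaining samples')
else:
    print('last batch is different')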
