Keras - How are batches and epochs used in fit_generator()?

Question

I have a video of 8000 frames, and I'd like to train a Keras model on batches of 200 frames each. I have a frame generator that loops through the video frame-by-frame and accumulates the (3 x 480 x 640) frames into a numpy matrix X of shape (200, 3, 480, 640) -- (batch size, rgb, frame height, frame width) -- and yields X and Y every 200th frame:

import cv2
import numpy as np
...
def _frameGenerator(videoPath, dataPath, batchSize):
    """
    Yield X and Y data when the batch is filled.
    """
    camera = cv2.VideoCapture(videoPath)
    # Property ids 3, 4 and 7 are frame width, frame height and frame count.
    # get() returns floats, so cast before using the values as array dimensions.
    width = int(camera.get(3))
    height = int(camera.get(4))
    frameCount = int(camera.get(7))  # Number of frames in the video file.

    truthData = _prepData(dataPath, frameCount)

    X = np.zeros((batchSize, 3, height, width))
    Y = np.zeros((batchSize, 1))

    batch = 0
    for frameIdx, truth in enumerate(truthData):
        ret, frame = camera.read()
        if not ret:
            continue

        batchIndex = frameIdx % batchSize

        # OpenCV returns frames as (height, width, channels); move channels first.
        X[batchIndex] = frame.transpose(2, 0, 1)
        Y[batchIndex] = truth

        # Yield once the last slot of the batch has been filled.
        if batchIndex == batchSize - 1:
            batch += 1
            print("now yielding batch %d" % batch)
            yield X, Y

Here's how I run fit_generator():

batchSize = 200
print("Starting training...")
model.fit_generator(
    _frameGenerator(videoPath, dataPath, batchSize),
    samples_per_epoch=8000,
    nb_epoch=10,
    verbose=args.verbosity
)

My understanding is an epoch finishes when samples_per_epoch samples have been seen by the model, and samples_per_epoch = batch size * number of batches = 200 * 40. So after training for an epoch on frames 0-7999, the next epoch will start training again from frame 0. Is this correct?

With this setup I expect 40 batches (of 200 frames each) to be passed from the generator to fit_generator, per epoch; this would be 8000 total frames per epoch -- i.e., samples_per_epoch=8000. Then for subsequent epochs, fit_generator would reinitialize the generator such that we begin training again from the start of the video. Yet this is not the case. After the first epoch is complete (after the model logs batches 0-24), the generator picks up where it left off. Shouldn't the new epoch start again from the beginning of the training dataset?
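The observed behaviour can be reproduced without Keras. The following is a minimal sketch, not Keras's actual implementation: `frame_batches` and `run_epochs` are hypothetical names, and `run_epochs` only mimics a training loop that counts samples per epoch while drawing from a single generator object. It shows that nothing in such a loop restarts the generator, so the second epoch begins where the first one stopped.

```python
def frame_batches(num_frames, batch_size):
    """Finite generator: yields (start, stop) frame ranges once, then ends."""
    for start in range(0, num_frames, batch_size):
        yield (start, min(start + batch_size, num_frames))


def run_epochs(generator, samples_per_epoch, num_epochs):
    """Hypothetical stand-in for a training loop: pulls batches from the
    same generator object until samples_per_epoch samples have been seen,
    and records the first frame index used in each epoch."""
    first_frames = []
    for _ in range(num_epochs):
        seen = 0
        epoch_start = None
        while seen < samples_per_epoch:
            start, stop = next(generator)
            if epoch_start is None:
                epoch_start = start
            seen += stop - start
        first_frames.append(epoch_start)
    return first_frames


# Two "epochs" of 4000 samples each over an 8000-frame source.
starts = run_epochs(frame_batches(8000, 200), samples_per_epoch=4000, num_epochs=2)
print(starts)  # [0, 4000]
```

Asking this sketch for a third epoch would raise StopIteration, because the finite generator has run out of frames and nothing recreates it.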

If there is something incorrect in my understanding of fit_generator please explain. I've gone through the documentation, this example, and these related issues. I'm using Keras v1.0.7 with the TensorFlow backend. This issue is also posted in the Keras repo.

Recommended answer

After the first epoch is complete (after the model logs batches 0-24), the generator picks up where it left off

This is an accurate description of what happens. Keras does not reset the generator between epochs; if you want it to rewind, the generator has to do that itself. Note that Keras's behavior is useful in many situations. For example, you can end an epoch after seeing half the data and then run the next epoch on the other half, which would be impossible if the generator state were reset at each epoch boundary. Shorter epochs like this can be useful for monitoring validation more closely.
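One way to rewind inside the generator is to wrap the pass over the data in an endless loop. The following is a minimal sketch under stated assumptions, not the asker's code: `looping_batches`, `load_frames` and `fake_video` are hypothetical names, and `load_frames` stands for any zero-argument callable that returns a fresh iterable of (frame, label) pairs, for example one that reopens the video file.

```python
import numpy as np


def looping_batches(load_frames, batch_size):
    """Yield (X, Y) batches forever, restarting from the first frame
    each time the source is exhausted.

    load_frames is a hypothetical zero-argument callable returning a fresh
    iterable of (frame, label) pairs.
    """
    while True:  # each pass of this loop is one full sweep over the data
        X, Y = [], []
        for frame, label in load_frames():
            X.append(frame)
            Y.append(label)
            if len(X) == batch_size:
                yield np.asarray(X), np.asarray(Y)
                X, Y = [], []
        # Leftover frames that do not fill a batch are dropped in this sketch.


def fake_video():
    """Stand-in data source: 6 tiny 'frames' whose label is the frame index."""
    for i in range(6):
        yield np.full((3, 2, 2), i, dtype=np.float32), i


gen = looping_batches(fake_video, batch_size=3)
labels = [next(gen)[1].tolist() for _ in range(4)]
print(labels)  # [[0, 1, 2], [3, 4, 5], [0, 1, 2], [3, 4, 5]]
```

With a generator shaped like this, choosing samples_per_epoch as the number of samples in one full sweep (8000 = 40 batches of 200 in the question) should make each epoch begin at frame 0, assuming batches are consumed in the order they are produced.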
