Why is accuracy from fit_generator different to that from evaluate_generator in Keras?


Question

What I do:

  • I am training a pre-trained CNN with Keras fit_generator(). This produces evaluation metrics (loss, acc, val_loss, val_acc) after each epoch. After training the model, I produce evaluation metrics (loss, acc) with evaluate_generator().

What I expect:

  • If I train the model for one epoch, I would expect the metrics obtained with fit_generator() and evaluate_generator() to be the same. Both should derive the metrics from the entire dataset.

What I observe:

  • Both loss and acc differ between fit_generator() and evaluate_generator().

What I don't understand:

  • Why the accuracy from fit_generator() is different to that from evaluate_generator().

My code:

from keras.preprocessing.image import ImageDataGenerator

def generate_data(path, imagesize, nBatches):
    datagen = ImageDataGenerator(rescale=1./255)
    generator = datagen.flow_from_directory(
         directory=path,                                        # path to the target directory
         target_size=(imagesize,imagesize),                     # dimensions to which all images found will be resized
         color_mode='rgb',                                      # whether the images will be converted to have 1, 3, or 4 channels
         classes=None,                                          # optional list of class subdirectories
         class_mode='categorical',                              # type of label arrays that are returned
         batch_size=nBatches,                                   # size of the batches of data
         shuffle=True)                                          # whether to shuffle the data
    return generator

[...]

import os
from datetime import datetime

def train_model(model, nBatches, nEpochs, trainGenerator, valGenerator, resultPath):
    history = model.fit_generator(generator=trainGenerator,
                                  steps_per_epoch=trainGenerator.samples//nBatches,     # total number of steps (batches of samples)
                                  epochs=nEpochs,                   # number of epochs to train the model
                                  verbose=2,                        # verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch
                                  callbacks=None,                   # keras.callbacks.Callback instances to apply during training
                                  validation_data=valGenerator,     # generator or tuple on which to evaluate the loss and any model metrics at the end of each epoch
                                  validation_steps=
                                  valGenerator.samples//nBatches,   # number of steps (batches of samples) to yield from validation_data generator before stopping at the end of every epoch
                                  class_weight=None,                # optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function
                                  max_queue_size=10,                # maximum size for the generator queue
                                  workers=32,                       # maximum number of processes to spin up when using process-based threading
                                  use_multiprocessing=True,         # whether to use process-based threading
                                  shuffle=False,                     # whether to shuffle the order of the batches at the beginning of each epoch
                                  initial_epoch=0)                  # epoch at which to start training
    print("%s: Model trained." % datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))

    # Save model
    modelPath = os.path.join(resultPath, datetime.now().strftime('%Y-%m-%d_%H-%M-%S') + '_modelArchitecture.h5')
    weightsPath = os.path.join(resultPath, datetime.now().strftime('%Y-%m-%d_%H-%M-%S') + '_modelWeights.h5')
    model.save(modelPath)
    model.save_weights(weightsPath)
    print("%s: Model saved." % datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))
    return history, model

[...]

def evaluate_model(model, generator):
    score = model.evaluate_generator(generator=generator,           # Generator yielding tuples
                                     steps=
                                     generator.samples//nBatches)   # number of steps (batches of samples) to yield from generator before stopping

    print("%s: Model evaluated:"
          "\n\t\t\t\t\t\t Loss: %.3f"
          "\n\t\t\t\t\t\t Accuracy: %.3f" %
          (datetime.now().strftime('%Y-%m-%d_%H-%M-%S'),
           score[0], score[1]))

[...]

def main():
    # Create model
    modelUntrained = create_model(imagesize, nBands, nClasses)

    # Prepare training and validation data
    trainGenerator = generate_data(imagePathTraining, imagesize, nBatches)
    valGenerator = generate_data(imagePathValidation, imagesize, nBatches)

    # Train and save model
    history, modelTrained = train_model(modelUntrained, nBatches, nEpochs, trainGenerator, valGenerator, resultPath)

    # Evaluate on validation data
    print("%s: Model evaluation (valX, valY):" % datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))
    evaluate_model(modelTrained, valGenerator)

    # Evaluate on training data
    print("%s: Model evaluation (trainX, trainY):" % datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))
    evaluate_model(modelTrained, trainGenerator)
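
A note on the step counts used above: samples//nBatches rounds down, so whenever the number of images is not a multiple of the batch size, the last partial batch is skipped and the metrics no longer cover the entire dataset. A minimal sketch of the arithmetic follows; the sample count (1050) and batch size (32) are made-up numbers, and no Keras call is involved:

```python
import math

def steps_floor(n_samples, batch_size):
    # Same arithmetic as samples // nBatches in the code above
    return n_samples // batch_size

def steps_ceil(n_samples, batch_size):
    # Rounds up so that the final, smaller batch is included as well
    return math.ceil(n_samples / batch_size)

n_samples, batch_size = 1050, 32   # hypothetical numbers for illustration

covered = steps_floor(n_samples, batch_size) * batch_size
print(steps_floor(n_samples, batch_size), covered)   # 32 steps cover 1024 images
print(steps_ceil(n_samples, batch_size))             # 33 steps cover all 1050
```

With a shuffling generator, which images fall into the skipped remainder changes from pass to pass, which may add to the mismatch between two passes over the same directory.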


Update

I found some sites that report on this issue.

I tried following some of their suggested solutions, so far without success. acc and loss still differ between fit_generator() and evaluate_generator(), even when the exact same data, produced by the same generator, is used for training and validation. Here is what I tried:

  • statically setting the learning_phase for the entire script, or before adding new layers to the pre-trained ones
    K.set_learning_phase(0) # testing  
    K.set_learning_phase(1) # training

  • unfreezing all batch normalization layers of the pre-trained model
    for layer in model.layers:
        if layer.name.startswith('bn'):
            layer.trainable = True
    

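  • not adding dropout or batch normalization as untrained layers
          from keras import optimizers
          from keras.applications.resnet50 import ResNet50
          from keras.layers import Dense, GlobalAveragePooling2D
          from keras.models import Model

          # Create pre-trained base model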
          basemodel = ResNet50(include_top=False,                     # exclude final pooling and fully connected layer in the original model
                               weights='imagenet',                    # pre-training on ImageNet
                               input_tensor=None,                     # optional tensor to use as image input for the model
                               input_shape=(imagesize,                # shape tuple
                                            imagesize,
                                            nBands),
                               pooling=None,                          # output of the model will be the 4D tensor output of the last convolutional layer
                               classes=nClasses)                      # number of classes to classify images into
      
          # Create new untrained layers
          x = basemodel.output
          x = GlobalAveragePooling2D()(x)                             # global spatial average pooling layer
          x = Dense(1024, activation='relu')(x)                       # fully-connected layer
          y = Dense(nClasses, activation='softmax')(x)                # logistic layer making sure that probabilities sum up to 1
      
          # Create model combining pre-trained base model and new untrained layers
          model = Model(inputs=basemodel.input,
                        outputs=y)
      
          # Freeze weights on pre-trained layers
          for layer in basemodel.layers:
              layer.trainable = False
      
          # Define learning optimizer
          learningRate = 0.01
          optimizerSGD = optimizers.SGD(lr=learningRate,              # learning rate.
                                        momentum=0.9,                 # parameter that accelerates SGD in the relevant direction and dampens oscillations
                                        decay=learningRate/nEpochs,   # learning rate decay over each update
                                        nesterov=True)                # whether to apply Nesterov momentum
          # Compile model
          model.compile(optimizer=optimizerSGD,                       # stochastic gradient descent optimizer
                        loss='categorical_crossentropy',              # objective function
                        metrics=['accuracy'],                         # metrics to be evaluated by the model during training and testing
                        loss_weights=None,                            # scalar coefficients to weight the loss contributions of different model outputs
                        sample_weight_mode=None,                      # sample-wise weights
                        weighted_metrics=None,                        # metrics to be evaluated and weighted by sample_weight or class_weight during training and testing
                        target_tensors=None)                          # tensor model's target, which will be fed with the target data during training
      

  • using different pre-trained CNNs as base model (VGG19, InceptionV3, InceptionResNetV2, Xception)
            from keras.applications.vgg19 import VGG19
        
            basemodel = VGG19(include_top=False,                        # exclude final pooling and fully connected layer in the original model
                                 weights='imagenet',                    # pre-training on ImageNet
                                 input_tensor=None,                     # optional tensor to use as image input for the model
                                 input_shape=(imagesize,                # shape tuple
                                              imagesize,
                                              nBands),
                                 pooling=None,                          # output of the model will be the 4D tensor output of the last convolutional layer
                                 classes=nClasses)                      # number of classes to classify images into
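
The attempts around learning_phase and batch normalization target a known source of such gaps: a batch normalization layer normalizes with the statistics of the current batch in training mode, but with stored moving averages in inference mode. The plain-Python sketch below illustrates that difference on a single feature. The moving mean and variance are invented values, and batch_norm is a simplified stand-in, not the Keras implementation:

```python
import math

def batch_norm(values, mean, var, eps=1e-3):
    # Normalize each value with the given mean and variance
    return [(v - mean) / math.sqrt(var + eps) for v in values]

batch = [2.0, 4.0, 6.0, 8.0]   # activations of one feature for one batch

# Training mode: statistics come from the batch itself
batch_mean = sum(batch) / len(batch)
batch_var = sum((v - batch_mean) ** 2 for v in batch) / len(batch)
train_out = batch_norm(batch, batch_mean, batch_var)

# Inference mode: statistics are the stored moving averages
# (hypothetical values, e.g. inherited from pre-training on other data)
moving_mean, moving_var = 0.0, 1.0
test_out = batch_norm(batch, moving_mean, moving_var)

print(train_out)   # centred around zero
print(test_out)    # all positive, because the stored mean does not fit this data
```

The same inputs therefore produce different activations, and different metrics, depending on the mode the layer runs in.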
        

Please let me know if there are other solutions around that I am missing.

Recommended answer

I have now managed to get the same evaluation metrics. I changed the following:

• I set seed in flow_from_directory(), as suggested by @Anakin:
        def generate_data(path, imagesize, nBatches):
                datagen = ImageDataGenerator(rescale=1./255)
                generator = datagen.flow_from_directory(directory=path,     # path to the target directory
                     target_size=(imagesize,imagesize),                     # dimensions to which all images found will be resized
                     color_mode='rgb',                                      # whether the images will be converted to have 1, 3, or 4 channels
                     classes=None,                                          # optional list of class subdirectories
                     class_mode='categorical',                              # type of label arrays that are returned
                     batch_size=nBatches,                                   # size of the batches of data
                     shuffle=True,                                          # whether to shuffle the data
                     seed=42)                                               # random seed for shuffling and transformations
                return generator
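
What a fixed seed buys can be illustrated without Keras: two shuffles driven by the same seed yield the same order, so repeated runs see the data in the same sequence. The sketch uses Python's random module and hypothetical file names; it does not reproduce how flow_from_directory() shuffles internally:

```python
import random

def shuffled(items, seed=None):
    # Return a shuffled copy; a fixed seed makes the order repeatable
    rng = random.Random(seed)
    result = list(items)
    rng.shuffle(result)
    return result

files = ["img_%02d.png" % i for i in range(10)]   # hypothetical file names

run_a = shuffled(files, seed=42)
run_b = shuffled(files, seed=42)
print(run_a == run_b)   # True: same seed, same order
```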
        


• I set use_multiprocessing=False in fit_generator(), following the warning: "use_multiprocessing=True and multiple workers may duplicate your data"
              history = model.fit_generator(generator=trainGenerator,
                                            steps_per_epoch=trainGenerator.samples//nBatches,     # total number of steps (batches of samples)
                                            epochs=nEpochs,                   # number of epochs to train the model
                                            verbose=2,                        # verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch
                                            callbacks=callback,               # keras.callbacks.Callback instances to apply during training
                                            validation_data=valGenerator,     # generator or tuple on which to evaluate the loss and any model metrics at the end of each epoch
                                            validation_steps=
                                            valGenerator.samples//nBatches,   # number of steps (batches of samples) to yield from validation_data generator before stopping at the end of every epoch
                                            class_weight=None,                # optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function
                                            max_queue_size=10,                # maximum size for the generator queue
                                            workers=1,                        # maximum number of processes to spin up when using process-based threading
                                            use_multiprocessing=False,        # whether to use process-based threading
                                            shuffle=False,                    # whether to shuffle the order of the batches at the beginning of each epoch
                                            initial_epoch=0)                  # epoch at which to start training
          


• I followed the Keras documentation on how to obtain reproducible results using Keras during development:
            import numpy as np
            import tensorflow as tf
            import random as rn
            from keras import backend as K
            
            np.random.seed(42)
            rn.seed(12345)
            session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                                          inter_op_parallelism_threads=1)
            tf.set_random_seed(1234)
            sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
            K.set_session(sess)
            


• Instead of rescaling input images with datagen = ImageDataGenerator(rescale=1./255), I now generate my data with:
                from keras.applications.resnet50 import preprocess_input
                datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
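
This change can matter because the ImageNet weights were trained on inputs prepared in a specific way. As far as I know, preprocess_input for ResNet50 follows the 'caffe' convention: it reorders channels from RGB to BGR and subtracts the ImageNet per-channel means, without scaling to [0, 1]. The sketch below re-implements that convention for a single pixel in plain Python to contrast it with rescale=1./255. The function names are made up for illustration and this is not the library code:

```python
# ImageNet channel means in BGR order, as used by the 'caffe' convention
IMAGENET_MEAN_BGR = [103.939, 116.779, 123.68]

def rescale_pixel(rgb):
    # What rescale=1./255 does: map values from [0, 255] to [0, 1]
    return [c / 255.0 for c in rgb]

def caffe_style_pixel(rgb):
    # Reorder RGB -> BGR, then subtract the per-channel mean (no scaling)
    bgr = rgb[::-1]
    return [c - m for c, m in zip(bgr, IMAGENET_MEAN_BGR)]

pixel = [255.0, 128.0, 0.0]   # an arbitrary orange pixel in RGB

print(rescale_pixel(pixel))       # values within [0, 1]
print(caffe_style_pixel(pixel))   # roughly zero-centred, range about [-124, 151]
```

Feeding [0, 1] values into layers whose frozen weights expect the second range is a plausible reason why the pre-trained base behaved inconsistently before the change.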
                


With this, I managed to get similar accuracy and loss from fit_generator() and evaluate_generator(). Also, using the same data for training and testing now results in similar metrics. Reasons for the remaining differences are provided in the Keras documentation.
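
One of the documented reasons can be sketched numerically: the training metric printed by fit_generator() is the average over all batches of the epoch, recorded while the weights keep changing, whereas evaluate_generator() scores every batch with the final weights. The per-batch losses below are invented numbers that simply decrease during the epoch:

```python
# Hypothetical per-batch losses recorded while the model is still improving
losses_during_epoch = [2.0, 1.5, 1.1, 0.8, 0.6]
n_batches = len(losses_during_epoch)

# What the training log reports: the mean over the whole epoch
reported_training_loss = sum(losses_during_epoch) / n_batches

# What an evaluation after the epoch sees: every batch scored with the
# final weights (assumed here to give the last observed loss on each batch)
final_loss = losses_during_epoch[-1]
evaluated_loss = sum([final_loss] * n_batches) / n_batches

print(reported_training_loss)   # higher, includes the early, poor batches
print(evaluated_loss)           # lower, reflects only the final weights
```

Regularization such as dropout, which is active during training and switched off during evaluation, widens the gap further in the same direction.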
