为什么fit_generator的精度与Keras中的valuate_generator的精度不同? [英] Why is accuracy from fit_generator different to that from evaluate_generator in Keras?

查看：165 发布时间：2020/4/25 9:46:17 python tensorflow machine-learning keras conv-neural-network

本文介绍了为什么fit_generator的精度与Keras中的valuate_generator的精度不同?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我做什么:

我正在用Keras fit_generator()训练预训练的CNN.这会在每个时期后产生评估指标(loss, acc, val_loss, val_acc).训练模型后，我用evaluate_generator()生成评估指标(loss, acc).

I am training a pre-trained CNN with Keras fit_generator(). This produces evaluation metrics (loss, acc, val_loss, val_acc) after each epoch. After training the model, I produce evaluation metrics (loss, acc) with evaluate_generator().

我期望的是

如果我训练模型一个纪元，我希望通过fit_generator()和evaluate_generator()获得的度量是相同的.他们俩都应基于整个数据集得出指标.

If I train the model for one epoch, I would expect that the metrics obtained with fit_generator() and evaluate_generator() are the same. They both should derive the metrics based on the entire dataset.

我的观察结果

loss和acc与fit_generator()和evaluate_generator()不同:

Both loss and acc are different from fit_generator() and evaluate_generator():

我不了解的地方:

为什么fit_generator()中的精度为与evaluate_generator()

Why the accuracy from fit_generator() is different to that from evaluate_generator()

我的代码:

def generate_data(path, imagesize, nBatches):
    datagen = ImageDataGenerator(rescale=1./255)
    generator = datagen.flow_from_directory\
        (directory=path,                                        # path to the target directory
         target_size=(imagesize,imagesize),                     # dimensions to which all images found will be resize
         color_mode='rgb',                                      # whether the images will be converted to have 1, 3, or 4 channels
         classes=None,                                          # optional list of class subdirectories
         class_mode='categorical',                              # type of label arrays that are returned
         batch_size=nBatches,                                   # size of the batches of data
         shuffle=True)                                          # whether to shuffle the data
    return generator

[...]

def train_model(model, nBatches, nEpochs, trainGenerator, valGenerator, resultPath):
    history = model.fit_generator(generator=trainGenerator,
                                  steps_per_epoch=trainGenerator.samples//nBatches,     # total number of steps (batches of samples)
                                  epochs=nEpochs,                   # number of epochs to train the model
                                  verbose=2,                        # verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch
                                  callbacks=None,                   # keras.callbacks.Callback instances to apply during training
                                  validation_data=valGenerator,     # generator or tuple on which to evaluate the loss and any model metrics at the end of each epoch
                                  validation_steps=
                                  valGenerator.samples//nBatches,   # number of steps (batches of samples) to yield from validation_data generator before stopping at the end of every epoch
                                  class_weight=None,                # optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function
                                  max_queue_size=10,                # maximum size for the generator queue
                                  workers=32,                       # maximum number of processes to spin up when using process-based threading
                                  use_multiprocessing=True,         # whether to use process-based threading
                                  shuffle=False,                     # whether to shuffle the order of the batches at the beginning of each epoch
                                  initial_epoch=0)                  # epoch at which to start training
    print("%s: Model trained." % datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))

    # Save model
    modelPath = os.path.join(resultPath, datetime.now().strftime('%Y-%m-%d_%H-%M-%S') + '_modelArchitecture.h5')
    weightsPath = os.path.join(resultPath, datetime.now().strftime('%Y-%m-%d_%H-%M-%S') + '_modelWeights.h5')
    model.save(modelPath)
    model.save_weights(weightsPath)
    print("%s: Model saved." % datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))
    return history, model

[...]

def evaluate_model(model, generator):
    score = model.evaluate_generator(generator=generator,           # Generator yielding tuples
                                     steps=
                                     generator.samples//nBatches)   # number of steps (batches of samples) to yield from generator before stopping

    print("%s: Model evaluated:"
          "\n\t\t\t\t\t\t Loss: %.3f"
          "\n\t\t\t\t\t\t Accuracy: %.3f" %
          (datetime.now().strftime('%Y-%m-%d_%H-%M-%S'),
           score[0], score[1]))

[...]

def main():
    # Create model
    modelUntrained = create_model(imagesize, nBands, nClasses)

    # Prepare training and validation data
    trainGenerator = generate_data(imagePathTraining, imagesize, nBatches)
    valGenerator = generate_data(imagePathValidation, imagesize, nBatches)

    # Train and save model
    history, modelTrained = train_model(modelUntrained, nBatches, nEpochs, trainGenerator, valGenerator, resultPath)

    # Evaluate on validation data
    print("%s: Model evaluation (valX, valY):" % datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))
    evaluate_model(modelTrained, valGenerator)

    # Evaluate on training data
    print("%s: Model evaluation (trainX, trainY):" % datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))
    evaluate_model(modelTrained, trainGenerator)

更新

我发现了一些报告此问题的网站:

I found some sites that report on this issue:

The Batch Normalization layer of Keras is broken
Strange behaviour of the loss function in keras model, with pretrained convolutional base
model.evaluate() gives a different loss on training data from the one in training process
Got different accuracy between history and evaluate
ResNet: 100% accuracy during training, but 33% prediction accuracy with the same data

到目前为止，我一直尝试遵循他们提出的一些解决方案，但没有成功. acc和loss仍然与fit_generator()和evaluate_generator()有所不同，即使使用由同一生成器生成的完全相同的数据进行训练和验证也是如此.这是我尝试过的:

I tried following some of their suggested solutions without success so far. acc and loss are still different from fit_generator() and evaluate_generator(), even when using the exact same data generated with the same generator for training and validation. Here is what I tried:

为整个脚本静态地设置learning_phase，或者在将新层添加到预训练的层之前

    K.set_learning_phase(0) # testing  
    K.set_learning_phase(1) # training

从预训练模型中解冻所有批次归一化层

    for i in range(len(model.layers)):
        if str.startswith(model.layers[i].name, 'bn'):
            model.layers[i].trainable=True

不将辍学或批处理规范化添加为未训练的层

    # Create pre-trained base model
    basemodel = ResNet50(include_top=False,                     # exclude final pooling and fully connected layer in the original model
                         weights='imagenet',                    # pre-training on ImageNet
                         input_tensor=None,                     # optional tensor to use as image input for the model
                         input_shape=(imagesize,                # shape tuple
                                      imagesize,
                                      nBands),
                         pooling=None,                          # output of the model will be the 4D tensor output of the last convolutional layer
                         classes=nClasses)                      # number of classes to classify images into

    # Create new untrained layers
    x = basemodel.output
    x = GlobalAveragePooling2D()(x)                             # global spatial average pooling layer
    x = Dense(1024, activation='relu')(x)                       # fully-connected layer
    y = Dense(nClasses, activation='softmax')(x)                # logistic layer making sure that probabilities sum up to 1

    # Create model combining pre-trained base model and new untrained layers
    model = Model(inputs=basemodel.input,
                  outputs=y)

    # Freeze weights on pre-trained layers
    for layer in basemodel.layers:
        layer.trainable = False

    # Define learning optimizer
    learningRate = 0.01
    optimizerSGD = optimizers.SGD(lr=learningRate,              # learning rate.
                                  momentum=0.9,                 # parameter that accelerates SGD in the relevant direction and dampens oscillations
                                  decay=learningRate/nEpochs,   # learning rate decay over each update
                                  nesterov=True)                # whether to apply Nesterov momentum
    # Compile model
    model.compile(optimizer=optimizerSGD,                       # stochastic gradient descent optimizer
                  loss='categorical_crossentropy',              # objective function
                  metrics=['accuracy'],                         # metrics to be evaluated by the model during training and testing
                  loss_weights=None,                            # scalar coefficients to weight the loss contributions of different model outputs
                  sample_weight_mode=None,                      # sample-wise weights
                  weighted_metrics=None,                        # metrics to be evaluated and weighted by sample_weight or class_weight during training and testing
                  target_tensors=None)                          # tensor model's target, which will be fed with the target data during training

使用不同的预训练CNN作为基本模型( VGG19，InceptionV3，InceptionResNetV2，Xception )

using different pre-trained CNNs as base model (VGG19, InceptionV3, InceptionResNetV2, Xception)

    from keras.applications.vgg19 import VGG19

    basemodel = VGG19(include_top=False,                        # exclude final pooling and fully connected layer in the original model
                         weights='imagenet',                    # pre-training on ImageNet
                         input_tensor=None,                     # optional tensor to use as image input for the model
                         input_shape=(imagesize,                # shape tuple
                                      imagesize,
                                      nBands),
                         pooling=None,                          # output of the model will be the 4D tensor output of the last convolutional layer
                         classes=nClasses)                      # number of classes to classify images into

请让我知道我是否还缺少其他解决方案.

Please let me know if there are other solutions around that I am missing.

推荐答案

我现在设法拥有相同的评估指标.我更改了以下内容:

I now managed having the same evaluation metrics. I changed the following:

我按照@Anakin的建议在flow_from_directory()中设置了seed.

def generate_data(path, imagesize, nBatches):
        datagen = ImageDataGenerator(rescale=1./255)
        generator = datagen.flow_from_directory(directory=path,     # path to the target directory
             target_size=(imagesize,imagesize),                     # dimensions to which all images found will be resize
             color_mode='rgb',                                      # whether the images will be converted to have 1, 3, or 4 channels
             classes=None,                                          # optional list of class subdirectories
             class_mode='categorical',                              # type of label arrays that are returned
             batch_size=nBatches,                                   # size of the batches of data
             shuffle=True,                                          # whether to shuffle the data
             seed=42)                                               # random seed for shuffling and transformations
        return generator

我根据警告设置了fit_generator()中的use_multiprocessing=False: use_multiprocessing=True and multiple workers may duplicate your data

I set use_multiprocessing=False in fit_generator() according to the warning: use_multiprocessing=True and multiple workers may duplicate your data

history = model.fit_generator(generator=trainGenerator,
                                  steps_per_epoch=trainGenerator.samples//nBatches,     # total number of steps (batches of samples)
                                  epochs=nEpochs,                   # number of epochs to train the model
                                  verbose=2,                        # verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch
                                  callbacks=callback,               # keras.callbacks.Callback instances to apply during training
                                  validation_data=valGenerator,     # generator or tuple on which to evaluate the loss and any model metrics at the end of each epoch
                                  validation_steps=
                                  valGenerator.samples//nBatches,   # number of steps (batches of samples) to yield from validation_data generator before stopping at the end of every epoch
                                  class_weight=None,                # optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function
                                  max_queue_size=10,                # maximum size for the generator queue
                                  workers=1,                        # maximum number of processes to spin up when using process-based threading
                                  use_multiprocessing=False,        # whether to use process-based threading
                                  shuffle=False,                    # whether to shuffle the order of the batches at the beginning of each epoch
                                  initial_epoch=0)                  # epoch at which to start training

我按照 keras文档关于如何在开发过程中使用Keras获得可再现的结果

I unified my python setup as suggested in the keras documentation on how to obtain reproducible results using Keras during development

import tensorflow as tf
import random as rn
from keras import backend as K

np.random.seed(42)
rn.seed(12345)
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                              inter_op_parallelism_threads=1)
tf.set_random_seed(1234)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)

我现在不再使用datagen = ImageDataGenerator(rescale=1./255)重新缩放输入图像，而是使用以下命令生成数据:

Instead of rescaling input images with datagen = ImageDataGenerator(rescale=1./255), I now generate my data with:

from keras.applications.resnet50 import preprocess_input
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

有了这个，我从fit_generator()和evaluate_generator()设法达到了相似的准确性和损失.而且，使用相同的数据进行培训和测试现在会产生相似的指标.

With this, I managed to have a similar accuracy and loss from fit_generator() and evaluate_generator(). Also, using the same data for training and testing now results in a similar metrics. Reasons for remaining differences are provided in the keras documentation.

这篇关于为什么fit_generator的精度与Keras中的valuate_generator的精度不同?的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

为什么fit_generator的精度与Keras中的valuate_generator的精度不同? [英] Why is accuracy from fit_generator different to that from evaluate_generator in Keras?

问题描述

推荐答案

相关文章

AI人工智能最新文章

热门教程

热门工具

登录关闭

为什么fit_generator的精度与Keras中的valuate_generator的精度不同? [英] Why is accuracy from fit_generator different to that from evaluate_generator in Keras?

问题描述

推荐答案

相关文章

AI人工智能最新文章

热门教程

热门工具

登录 关闭

登录关闭