MNIST and transfer learning with VGG16 in Keras: low validation accuracy

Problem description

I recently started taking advantage of Keras's flow_from_dataframe() feature for a project, and decided to test it with the MNIST dataset. I have a directory full of the MNIST samples in png format, and a dataframe with the absolute directory for each in one column and the label in the other.

I'm also using transfer learning, importing VGG16 as a base, and adding my own 512 node relu dense layer and 0.5 drop-out before a softmax layer of 10. (For digits 0-9). I'm using rmsprop (lr=1e-4) as the optimizer.

When I launch my environment, it calls the latest version of keras_preprocessing from Git, which has support for absolute directories and capitalized file extensions.

My problem is that I have a very high training accuracy, and a terribly low validation accuracy. By my final epoch (10), I had a training accuracy of 0.94 and a validation accuracy of 0.01.

I'm wondering if there's something fundamentally wrong with my script? With another dataset, I'm even getting NaNs for both my training and validation loss values after epoch 4. (I checked the relevant columns, there aren't any null values!)

Here's my code. I'd be deeply appreciative if someone could glance through it and see if anything jumps out at them.

import pandas as pd
import numpy as np

import keras
from keras_preprocessing.image import ImageDataGenerator

from keras import applications
from keras import optimizers
from keras.models import Model 
from keras.layers import Dropout, Flatten, Dense, GlobalAveragePooling2D
from keras import backend as k 
from keras.callbacks import ModelCheckpoint, CSVLogger

from keras.applications.vgg16 import VGG16, preprocess_input

# INITIALIZE MODEL

img_width, img_height = 32, 32
model = VGG16(weights = 'imagenet', include_top=False, input_shape = (img_width, img_height, 3))

# freeze all layers
for layer in model.layers:
    layer.trainable = False

# Adding custom Layers 
x = model.output
x = Flatten()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(10, activation="softmax")(x)

# creating the final model 
model_final = Model(inputs = model.input, outputs = predictions)

# compile the model 
rms = optimizers.RMSprop(lr=1e-4)
#adadelta = optimizers.Adadelta(lr=0.001, rho=0.5, epsilon=None, decay=0.0)

model_final.compile(loss = "categorical_crossentropy", optimizer = rms, metrics=["accuracy"])

# LOAD AND DEFINE SOURCE DATA

train = pd.read_csv('MNIST_train.csv', index_col=0)
val = pd.read_csv('MNIST_test.csv', index_col=0)

nb_train_samples = 60000
nb_validation_samples = 10000
batch_size = 60
epochs = 10

# Initiate the train and test generators
train_datagen = ImageDataGenerator()
test_datagen = ImageDataGenerator()

train_generator = train_datagen.flow_from_dataframe(dataframe=train,
                                                    directory=None,
                                                    x_col='train_samples',
                                                    y_col='train_labels',
                                                    has_ext=True,
                                                    target_size = (img_height,
                                                                   img_width),
                                                    batch_size = batch_size, 
                                                    class_mode = 'categorical',
                                                    color_mode = 'rgb')

validation_generator = test_datagen.flow_from_dataframe(dataframe=val,
                                                        directory=None,
                                                        x_col='test_samples',
                                                        y_col='test_labels',
                                                        has_ext=True,
                                                        target_size = (img_height, 
                                                                       img_width),
                                                        batch_size = batch_size, 
                                                        class_mode = 'categorical',
                                                        color_mode = 'rgb')

# GET CLASS INDICES
print('****************')
for cls, idx in train_generator.class_indices.items():
    print('Class #{} = {}'.format(idx, cls))
print('****************')

# DEFINE CALLBACKS

path = './chk/epoch_{epoch:02d}-valLoss_{val_loss:.2f}-valAcc_{val_acc:.2f}.hdf5'

chk = ModelCheckpoint(path, monitor = 'val_acc', verbose = 1, save_best_only = True, mode = 'max')

logger = CSVLogger('./chk/training_log.csv', separator = ',', append=False)

nPlus = 1
samples_per_epoch = nb_train_samples * nPlus

# Train the model 
model_final.fit_generator(train_generator,
                          steps_per_epoch = int(samples_per_epoch/batch_size),
                          epochs = epochs,
                          validation_data = validation_generator,
                          validation_steps = int(nb_validation_samples/batch_size),
                          callbacks = [chk, logger])

Solution

Have you tried explicitly defining the classes of the images? For example:

train_generator=image.ImageDataGenerator().flow_from_dataframe(classes=[0,1,2,3,4,5,6,7,8,9])

in both the train and validation generators.

I have found that the train and validation generators sometimes build different label-to-index dictionaries (class_indices), so the same digit can end up with a different index in each.
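
To illustrate how such a mismatch could arise, here is a minimal pure-Python sketch. It assumes a generator derives its mapping by sorting the distinct labels it sees; that is an assumption made for illustration, not a confirmed description of keras_preprocessing internals. The helper names build_class_indices and mapping_mismatches are hypothetical.

```python
def build_class_indices(labels):
    """Map each distinct label (as a string) to an index in sorted order.

    Assumption: this only mimics how a generator might derive its
    mapping from the labels it sees; it is not the library's code.
    """
    classes = sorted({str(label) for label in labels})
    return {cls: idx for idx, cls in enumerate(classes)}


def mapping_mismatches(train_indices, val_indices):
    """Return {label: (train_idx, val_idx)} for labels whose indices differ."""
    all_labels = sorted(set(train_indices) | set(val_indices))
    return {
        label: (train_indices.get(label), val_indices.get(label))
        for label in all_labels
        if train_indices.get(label) != val_indices.get(label)
    }


# Hypothetical data: the training frame has all ten digits,
# while the validation frame happens to contain no zeros.
train_indices = build_class_indices(range(10))
val_indices = build_class_indices(range(1, 10))

# Every label is shifted by one, so predictions and targets no longer line up.
mismatches = mapping_mismatches(train_indices, val_indices)
for label, (train_idx, val_idx) in mismatches.items():
    print("label {}: train index {}, validation index {}".format(label, train_idx, val_idx))
```

With the real generators from the question, the equivalent check would be to compare train_generator.class_indices with validation_generator.class_indices before training. If they differ, passing the same classes list to both flow_from_dataframe() calls, as suggested above, is one way to force them to agree.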
