Finetuning VGG model with VGGFace weights

Problem description

I am using a finetuned VGG16 model with the pretrained 'VGGFace' weights to work on the Labeled Faces in the Wild (LFW) dataset. The problem is that after training for an epoch I get very low accuracy (around 0.0037%), i.e., the model isn't learning at all.

I think it has something to do with my architecture, which looks like this:

import keras
from keras.models import Model
from keras.layers import Flatten, Dense
from keras_vggface.vggface import VGGFace
from sklearn.model_selection import KFold

# Pretrained VGGFace convolutional base; the original classifier is dropped.
vgg_x = VGGFace(model='vgg16', weights='vggface',
                input_shape=(224, 224, 3), include_top=False)
last_layer = vgg_x.get_layer('pool5').output
x = Flatten(name='flatten')(last_layer)
x = Dense(4096, activation='relu', name='fc6')(x)

# 311-way softmax classifier head.
out = Dense(311, activation='softmax', name='fc8')(x)
custom_vgg_model = Model(vgg_x.input, out)

custom_vgg_model.compile(optimizer=keras.optimizers.Adam(),
                         loss=keras.losses.categorical_crossentropy,
                         metrics=['accuracy'])

# shuffle=True is needed for random_state to have any effect.
kfold = KFold(n_splits=15, shuffle=True, random_state=42)

for train_index, test_index in kfold.split(X_train):
    X_cross_train = X_train[train_index]
    X_cross_test = X_train[test_index]
    Y_cross_train = y_train[train_index]
    Y_cross_test = y_train[test_index]
    custom_vgg_model.fit(x=X_cross_train, y=Y_cross_train,
                         batch_size=32, epochs=10, verbose=2,
                         validation_data=(X_cross_test, Y_cross_test))

I expect the model to at least learn something, even if it doesn't reach great accuracy. What could be the problem? Is there something wrong with my architecture, or anything else?

The preprocessing step shouldn't be wrong, but just in case:

# version=1 selects the mean subtraction matching the VGG16 VGGFace weights.
image_set_x = keras_vggface.utils.preprocess_input(image_set_x, version=1)

Answer

Try training with a smaller learning rate than the default (for instance, 1e-4). The randomly initialized weights of the classification layers can produce large gradient updates; these propagate large weight updates into the lower layers and essentially destroy the pretrained weights in the convolutional base.
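A minimal sketch of that change, reusing custom_vgg_model from the question (the 1e-4 value is the answer's suggestion; on older Keras the argument is lr rather than learning_rate):

from keras.optimizers import Adam

# Recompile with a reduced learning rate so the gradients coming from the
# freshly initialized classifier don't overwrite the pretrained base.
custom_vgg_model.compile(optimizer=Adam(learning_rate=1e-4),  # lr=1e-4 on older Keras
                         loss='categorical_crossentropy',
                         metrics=['accuracy'])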

In addition, you can use the ReduceLROnPlateau callback to decrease the learning rate further when the validation accuracy stops improving.
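A sketch of wiring up the callback, reusing the fold variables from the question; the factor, patience, and min_lr values are illustrative assumptions, and older Keras reports the metric as 'val_acc' rather than 'val_accuracy':

from keras.callbacks import ReduceLROnPlateau

# Halve the learning rate whenever validation accuracy has not improved
# for three consecutive epochs, with a floor of 1e-6.
reduce_lr = ReduceLROnPlateau(monitor='val_accuracy',  # 'val_acc' on older Keras
                              factor=0.5, patience=3, min_lr=1e-6)

custom_vgg_model.fit(x=X_cross_train, y=Y_cross_train,
                     batch_size=32, epochs=10, verbose=2,
                     validation_data=(X_cross_test, Y_cross_test),
                     callbacks=[reduce_lr])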

Another strategy to avoid large disruptive gradient updates is to first freeze the weights in the convolutional base, pre-train the classification layers, and then finetune the entire stack with a small learning rate. This approach is explained in detail in the Keras blog post on transfer learning: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
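A sketch of that two-phase schedule, reusing the names from the question; the epoch counts and learning rates here are illustrative assumptions, not values from the answer:

from keras.optimizers import Adam

# Phase 1: freeze the pretrained convolutional base so only the new
# Dense layers (fc6, fc8) receive gradient updates.
for layer in vgg_x.layers:
    layer.trainable = False

custom_vgg_model.compile(optimizer=Adam(learning_rate=1e-3),
                         loss='categorical_crossentropy',
                         metrics=['accuracy'])
custom_vgg_model.fit(x=X_cross_train, y=Y_cross_train, batch_size=32,
                     epochs=5, verbose=2,
                     validation_data=(X_cross_test, Y_cross_test))

# Phase 2: unfreeze the base and finetune the whole stack with a much
# smaller learning rate. Recompiling is required for the new trainable
# flags to take effect.
for layer in vgg_x.layers:
    layer.trainable = True

custom_vgg_model.compile(optimizer=Adam(learning_rate=1e-5),
                         loss='categorical_crossentropy',
                         metrics=['accuracy'])
custom_vgg_model.fit(x=X_cross_train, y=Y_cross_train, batch_size=32,
                     epochs=10, verbose=2,
                     validation_data=(X_cross_test, Y_cross_test))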
