How to do Transfer Learning without ImageNet weights?


Problem Description

Here is a description of my project:

Dataset1: The bigger dataset; contains two (binary) classes of images.

Dataset2: Contains 2 classes that are very similar in appearance to Dataset1. I want to make a model that uses transfer learning by learning from Dataset1, then applying those weights on Dataset2 with a lower learning rate.

Therefore I'm looking to train the entire VGG16 on Dataset1, then use transfer learning to fine-tune the last layers for Dataset2. I do not want to use the pre-trained ImageNet weights. This is the code I am using, and I have saved the weights from it:


from tensorflow.keras.layers import Input, Lambda, Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
import numpy as np
from glob import glob
import matplotlib.pyplot as plt

vgg = VGG16(input_shape=(224, 224, 3), weights=None, include_top=False)  # 224 matches the generators' target_size below

# train all layers from scratch on Dataset1 -- with weights=None the layers
# start from random initialization, so freezing them here would prevent the
# backbone from learning anything
for layer in vgg.layers:
    layer.trainable = True
    
x = Flatten()(vgg.output)
prediction = Dense(2, activation='softmax')(x)  # Dense is already imported above

model = Model(inputs=vgg.input, outputs=prediction)

model.compile(
  loss='categorical_crossentropy',
  optimizer='adam',
  metrics=['accuracy']
)


train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('chest_xray/train',
                                                 target_size = (224, 224),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')

# use the test generator here (augmentation is only for training data)
test_set = test_datagen.flow_from_directory('chest_xray/test',
                                                 target_size = (224, 224),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')

# fit the model (model.fit accepts generators in TF2; fit_generator is deprecated)
r = model.fit(
  training_set,
  validation_data=test_set,
  epochs=5,
  steps_per_epoch=len(training_set),
  validation_steps=len(test_set)
)

model.save_weights('first_try.h5') 

Answer

Update

Based on your query, it seems that the number of classes won't be different in Dataset2. At the same time, you also don't want to use the ImageNet weights. So, in that case, you don't need to map or store the weights manually (as described below). Just load the model and the saved weights and train on Dataset2: freeze all the layers trained on Dataset1 and train only the last layer on Dataset2; really straightforward.

Though you don't need the full walkthrough below for that, I am keeping it anyway for future reference.
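In code, that recipe might look like the following sketch. It reuses the architecture from the question and the first_try.h5 weights saved above; the 1e-5 learning rate is just an illustrative choice for low-learning-rate fine-tuning, not a prescribed value:

from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.applications.vgg16 import VGG16
import tensorflow as tf

# rebuild the same architecture as in the question, then load the
# weights saved after training on Dataset1
vgg = VGG16(input_shape=(224, 224, 3), weights=None, include_top=False)
x = Flatten()(vgg.output)
prediction = Dense(2, activation='softmax')(x)
model = Model(inputs=vgg.input, outputs=prediction)
model.load_weights('first_try.h5')

# freeze everything trained on Dataset1; only the classifier stays trainable
for layer in model.layers[:-1]:
    layer.trainable = False

# recompile with a lower learning rate before fine-tuning on Dataset2
model.compile(
    loss='categorical_crossentropy',
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    metrics=['accuracy'])

# then call model.fit(...) with generators pointed at the Dataset2 directories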


Here is a small demonstration of what you probably need; hope it gives you some insight. We will train on the CIFAR-10 data set, which has 10 classes, and then try to use it for transfer learning on a different data set with a different input size and a different number of classes.

import numpy as np
import tensorflow as tf 
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Dropout

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# train set / data 
x_train = x_train.astype('float32') / 255

# validation set / data 
x_test = x_test.astype('float32') / 255

# train set / target 
y_train = tf.keras.utils.to_categorical(y_train, num_classes=10)
# validation set / target 
y_test = tf.keras.utils.to_categorical(y_test, num_classes=10)

print(x_train.shape, y_train.shape) 
print(x_test.shape, y_test.shape)  
'''
(50000, 32, 32, 3) (50000, 10)
(10000, 32, 32, 3) (10000, 10)
'''

Model

# declare input shape 
input = tf.keras.Input(shape=(32,32,3))
# Block 1
x = tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu")(input)
x = tf.keras.layers.MaxPooling2D(3)(x)

# Now that we apply global max pooling.
gap = tf.keras.layers.GlobalMaxPooling2D()(x)

# Finally, we add a classification layer.
output = tf.keras.layers.Dense(10, activation='softmax')(gap)

# bind all
func_model = tf.keras.Model(input, output)

'''
Model: "functional_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         [(None, 32, 32, 3)]       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 15, 15, 32)        896       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 32)          0         
_________________________________________________________________
global_max_pooling2d_1 (Glob (None, 32)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                330       
=================================================================
Total params: 1,226
Trainable params: 1,226
Non-trainable params: 0
'''

Run the model to get some weight matrices as follows:

# compile 
print('\nFunctional API')
func_model.compile(
          loss      = tf.keras.losses.CategoricalCrossentropy(),
          metrics   = tf.keras.metrics.CategoricalAccuracy(),
          optimizer = tf.keras.optimizers.Adam())
# fit 
func_model.fit(x_train, y_train, batch_size=128, epochs=1)
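If you want to reuse this trained base model in a later session (the way the question saves first_try.h5), you could also save its weights at this point; the filename below is just an example:

# optional: persist the trained base model's weights for a later transfer session
func_model.save_weights('cifar10_base.h5')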

Transfer Learning

Let's use it for MNIST. It also has 10 classes, but for the sake of needing a different number of classes, we will make even and odd categories from it (2 classes). Below is how we prepare these data sets:

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# train set / data 
x_train = np.expand_dims(x_train, axis=-1)
x_train = np.repeat(x_train, 3, axis=-1)
x_train = x_train.astype('float32') / 255
# train set / target 
y_train = tf.keras.utils.to_categorical((y_train % 2 == 0).astype(int), 
                                        num_classes=2)

# validation set / data 
x_test = np.expand_dims(x_test, axis=-1)
x_test = np.repeat(x_test, 3, axis=-1)
x_test = x_test.astype('float32') / 255
# validation set / target 

y_test = tf.keras.utils.to_categorical((y_test % 2 == 0).astype(int), 
                                       num_classes=2)

print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)  
'''
(60000, 28, 28, 3) (60000, 2)
(10000, 28, 28, 3) (10000, 2)
'''

If you're familiar with the usage of ImageNet pretrained weights in keras models, you probably use include_top. By setting it to False we can easily load a weight file that has no top (classifier) information of the pretrained model. So here we need to do that manually, kind of. We need to grab the weight matrices up to the last layer before the classifier (in our case Dense(10, softmax)), put them in a new instance of the base model, and add a new classifier layer (in our case Dense(2, softmax)).

for i, layer in enumerate(func_model.layers):
    print(i,'\t',layer.trainable,'\t  :',layer.name)

'''
  Train_Bool  : Layer Names
0    True     : input_1
1    True     : conv2d
2    True     : max_pooling2d
3    True     : global_max_pooling2d # < we go till here to grab the weight and biases
4    True     : dense  # 10 classes (from previous model)
'''

Get the weights

# grab the weights of the named layer; note that a GlobalMaxPooling2D layer
# has no trainable weights, so this list stays empty here -- the same pattern
# applies to any layer that does carry weights (Conv2D, Dense, ...)
sparsified_weights = []
for w in func_model.get_layer(name='global_max_pooling2d').get_weights():
    sparsified_weights.append(w)

By that, we map the weights from the old model except for the classifier layer (Dense). Please note, here we grab the weights up to the GAP layer, which sits right before the classifier.

Now we will create a new model, the same as the old model except for the last layer (the 10-unit Dense), and at the same time add a new Dense with 2 units.

predictions    = Dense(2, activation='softmax')(func_model.layers[-2].output)
new_func_model = Model(inputs=func_model.inputs, outputs = predictions) 

And now we can set the weights on the new model as follows:

new_func_model.get_layer(name='global_max_pooling2d').set_weights(sparsified_weights)

You can verify it as follows; everything will be the same except the last layer:

func_model.get_weights()      # last layer, Dense (10)
new_func_model.get_weights()  # last layer, Dense (2)
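As an aside (a sketch, not part of the original answer): a more general way to do this manual mapping is to copy weights layer by layer between two models wherever the layer names match, so that only a brand-new head keeps its fresh initialization:

def copy_matching_weights(src_model, dst_model):
    # index the source layers by name
    src_layers = {layer.name: layer for layer in src_model.layers}
    # copy weights for every layer name the two models share;
    # layers that exist only in dst_model (e.g. a new classifier head)
    # keep their fresh initialization
    for layer in dst_model.layers:
        if layer.name in src_layers:
            layer.set_weights(src_layers[layer.name].get_weights())

copy_matching_weights(func_model, new_func_model)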

Now you can train the model on the new data set, which in our case is MNIST:

new_func_model.summary()

'''
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 32, 32, 3)]       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 15, 15, 32)        896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 5, 5, 32)          0         
_________________________________________________________________
global_max_pooling2d (Global (None, 32)                0         
_________________________________________________________________
dense_6 (Dense)              (None, 2)                 66        
=================================================================
Total params: 962
Trainable params: 962
Non-trainable params: 0
'''

# compile 
print('\nFunctional API')
new_func_model.compile(
          loss      = tf.keras.losses.CategoricalCrossentropy(),
          metrics   = tf.keras.metrics.CategoricalAccuracy(),
          optimizer = tf.keras.optimizers.Adam())
# fit 
new_func_model.fit(x_train, y_train, batch_size=128, epochs=1)

WARNING:tensorflow:Model was constructed with shape (None, 32, 32, 3) for input Tensor("input_1:0", shape=(None, 32, 32, 3), dtype=float32), but it was called on an input with incompatible shape (None, 28, 28, 3).
WARNING:tensorflow:Model was constructed with shape (None, 32, 32, 3) for input Tensor("input_1:0", shape=(None, 32, 32, 3), dtype=float32), but it was called on an input with incompatible shape (None, 28, 28, 3).
469/469 [==============================] - 1s 3ms/step - loss: 0.6453 - categorical_accuracy: 0.6447
<tensorflow.python.keras.callbacks.History at 0x7f7af016feb8>
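Those warnings appear because the base model was built for 32x32x3 CIFAR-10 inputs while the MNIST samples are 28x28x3; the fully convolutional stem plus global pooling tolerates the mismatch, but if you want to silence the warnings you can resize the inputs up front (a sketch, not part of the original run):

# resize the 28x28 MNIST images to the 32x32 shape the model was built for
x_train = tf.image.resize(x_train, [32, 32]).numpy()
x_test  = tf.image.resize(x_test, [32, 32]).numpy()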
