Selecting loss and metrics for a TensorFlow model


Question

I'm trying to do transfer learning, using a pretrained Xception model with a newly added classifier.

Here is the model:

base_model = keras.applications.Xception(
    weights="imagenet",
    input_shape=(224,224,3),
    include_top=False
)

The dataset I'm using is oxford_flowers102, taken directly from TensorFlow Datasets (see its dataset page).

I have a problem selecting some parameters: either the training accuracy shows suspiciously low values, or there's an error.

I need help specifying these parameters for this (oxford_flowers102) dataset:

  1. The newly added dense layer for the classifier. I was trying outputs = keras.layers.Dense(102, activation='softmax')(x), and I'm not sure whether I should set the activation function here or not.
  2. The loss function for the model.
  3. The metrics.

I tried:

model.compile(
    optimizer=keras.optimizers.Adam(),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[keras.metrics.Accuracy()],
)

I'm not sure whether it should be SparseCategoricalCrossentropy or CategoricalCrossentropy, and what about the from_logits parameter?

I'm also not sure whether I should choose keras.metrics.Accuracy() or keras.metrics.CategoricalAccuracy() for the metrics.

I am definitely lacking some theoretical knowledge, but right now I just need this to work. Looking forward to your answers!

Recommended answer

About the dataset: oxford_flowers102

The dataset is divided into a training set, a validation set, and a test set. The training set and the validation set each consist of 10 images per class (1020 images each). The test set consists of the remaining 6149 images (minimum 20 per class).

'test'        6,149
'train'       1,020
'validation'  1,020

If we check, we'll see:

import tensorflow_datasets as tfds
tfds.disable_progress_bar()

data, ds_info = tfds.load('oxford_flowers102', 
                          with_info=True, as_supervised=True)
train_ds, valid_ds, test_ds = data['train'], data['validation'], data['test']

# unpack each (image, label) pair; avoids shadowing the `data` dict above
for i, (image, label) in enumerate(train_ds.take(3)):
  print(i+1, image.shape, label)
1 (500, 667, 3) tf.Tensor(72, shape=(), dtype=int64)
2 (500, 666, 3) tf.Tensor(84, shape=(), dtype=int64)
3 (670, 500, 3) tf.Tensor(70, shape=(), dtype=int64)

ds_info.features["label"].num_classes
102

So the dataset has 102 classes, the targets are integer labels, and the input images come in different shapes.

First, if you keep the integer targets (labels), you should use sparse_categorical_accuracy for the accuracy metric and sparse_categorical_crossentropy for the loss function. But if you transform the integer labels to one-hot encoded vectors, then you should use categorical_accuracy for the accuracy metric and categorical_crossentropy for the loss function. As this dataset has integer labels, you can either choose the sparse_categorical variants directly or transform the labels to one-hot in order to use the categorical variants.
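To illustrate why the two loss variants are interchangeable, here is a minimal pure-Python sketch of what each one computes for a single sample. It is not the Keras implementation; the helper names and the 3-class probabilities are made up for illustration.

```python
import math

def one_hot(label, num_classes):
    """Turn an integer class index into a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def sparse_categorical_crossentropy(label, probs):
    """Loss for an integer label: -log of the probability of the true class."""
    return -math.log(probs[label])

def categorical_crossentropy(one_hot_label, probs):
    """Loss for a one-hot label: -sum(y_i * log(p_i)) over all classes."""
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probs))

probs = [0.1, 0.7, 0.2]  # hypothetical softmax output for 3 classes
label = 1                # integer target

sparse_loss = sparse_categorical_crossentropy(label, probs)
dense_loss = categorical_crossentropy(one_hot(label, 3), probs)
print(sparse_loss, dense_loss)  # the two values agree
```

Both functions compute the same quantity; the only difference is the label format they expect, so the choice comes down to how the labels are stored.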

Second, if you set outputs = keras.layers.Dense(102, activation='softmax')(x) as the last layer, you will get probability scores. But if you set outputs = keras.layers.Dense(102)(x), then you will get logits. So, if you set activation='softmax', you should not use from_logits=True. For example, in your code above you should do one of the following:

...
(a)
# Softmax activation: the layer outputs probabilities, not logits
outputs = keras.layers.Dense(102, activation='softmax')(x)
...
model.compile(
    optimizer=keras.optimizers.Adam(),
    # probabilities in, so from_logits must be False
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    metrics=[keras.metrics.Accuracy()],  # kept from the question; see the third point
)

or,

(b)
# No activation: the layer outputs raw logits
outputs = keras.layers.Dense(102)(x)
...
model.compile(
    optimizer=keras.optimizers.Adam(),
    # logits in, so the loss applies softmax itself
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[keras.metrics.Accuracy()],  # kept from the question; see the third point
)
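The equivalence of options (a) and (b), and the cost of mixing them up, can be sketched in plain Python. This is an illustrative re-implementation under simplifying assumptions (one sample, three classes, made-up logits), not the Keras code itself.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_ce(label, outputs, from_logits):
    """Sparse cross-entropy; applies softmax first when given logits."""
    probs = softmax(outputs) if from_logits else outputs
    return -math.log(probs[label])

logits = [2.0, 0.5, -1.0]  # hypothetical output of a Dense layer without activation
label = 0

loss_a = sparse_ce(label, softmax(logits), from_logits=False)  # option (a)
loss_b = sparse_ce(label, logits, from_logits=True)            # option (b)
# Mismatch: softmax output treated as logits, so softmax is applied twice.
loss_mismatch = sparse_ce(label, softmax(logits), from_logits=True)
print(loss_a, loss_b, loss_mismatch)
```

Options (a) and (b) give the same loss. The mismatched setup, which is what the question's code does (softmax layer plus from_logits=True), flattens the probabilities and reports a larger loss for this confident, correct prediction.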

Third, Keras accepts string identifiers such as metrics=['acc'] and optimizer='adam'. But since you pass a specific loss object, you need to be equally specific about the metric. keras.metrics.Accuracy() checks whether predictions equal labels exactly, which is why it reports suspiciously low values when the model outputs probabilities. Instead, choose keras.metrics.SparseCategoricalAccuracy() if your targets are integers, or keras.metrics.CategoricalAccuracy() if your targets are one-hot encoded vectors.
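The difference between the metrics is easiest to see on a tiny batch. The functions below are rough pure-Python analogues written for illustration only; the labels and predictions are invented, and the real Keras metrics additionally handle batching, weights, and state.

```python
def exact_match_accuracy(labels, preds):
    """Rough analogue of keras.metrics.Accuracy: fraction of predicted
    values that equal the label value exactly."""
    matches = [float(p == y) for y, row in zip(labels, preds) for p in row]
    return sum(matches) / len(matches)

def sparse_categorical_accuracy(labels, preds):
    """Fraction of samples whose highest-scoring class equals the integer label."""
    hits = [max(range(len(row)), key=row.__getitem__) == y
            for y, row in zip(labels, preds)]
    return sum(hits) / len(hits)

labels = [2, 0, 0]  # integer targets
preds = [           # hypothetical softmax outputs, one row per sample
    [0.1, 0.2, 0.7],
    [0.8, 0.1, 0.1],
    [0.3, 0.4, 0.3],
]

exact = exact_match_accuracy(labels, preds)
sparse = sparse_categorical_accuracy(labels, preds)
print(exact, sparse)
```

No probability ever equals an integer class index, so the exact-match metric stays at zero even though two of the three predictions pick the right class.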

Here is an end-to-end example. Note that I will transform the integer labels to one-hot encoded vectors (here it's simply a matter of preference). Also, I want probabilities (not logits) from the last layer, which means from_logits=False. Given these choices, I need the following settings for training:

# use softmax to get probabilities
outputs = keras.layers.Dense(102,
                   activation='softmax')(x)

# probabilities, not logits, so from_logits is False (this is already the default)
loss = keras.losses.CategoricalCrossentropy(from_logits=False)

# one-hot targets, so use the categorical accuracy metric
metrics = [keras.metrics.CategoricalAccuracy()]

Let's go through the whole code.

import tensorflow_datasets as tfds
tfds.disable_progress_bar()

data, ds_info = tfds.load('oxford_flowers102', 
                         with_info=True, as_supervised=True)
train_ds, valid_ds, test_ds = data['train'], data['validation'], data['test']


NUM_CLASSES = ds_info.features["label"].num_classes
train_size =  len(data['train'])

batch_size = 64
img_size = 120 

Preprocessing and augmentation

import tensorflow as tf 

# pre-process functions 
def normalize_resize(image, label):
    image = tf.cast(image, tf.float32)
    image = tf.divide(image, 255)
    image = tf.image.resize(image, (img_size, img_size))
    label = tf.one_hot(label , depth=NUM_CLASSES) # int to one-hot
    return image, label

# augmentation 
def augment(image, label):
    image = tf.image.random_flip_left_right(image)
    return image, label 


train = train_ds.map(normalize_resize).cache().map(augment).shuffle(100).\
                          batch(batch_size).repeat()
valid = valid_ds.map(normalize_resize).cache().batch(batch_size)
test = test_ds.map(normalize_resize).cache().batch(batch_size)

Model

from tensorflow import keras 

base_model = keras.applications.Xception(
    weights='imagenet',  
    input_shape=(img_size, img_size, 3),
    include_top=False)  

base_model.trainable = False
inputs = keras.Input(shape=(img_size, img_size, 3))
x = base_model(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(NUM_CLASSES, activation='softmax')(x)
model = keras.Model(inputs, outputs)

Additionally, I like to use two metrics here to compute top-1 and top-3 accuracy.

model.compile(optimizer=keras.optimizers.Adam(),
              loss=keras.losses.CategoricalCrossentropy(),
              metrics=[
                       keras.metrics.TopKCategoricalAccuracy(k=3, name='acc_top3'),
                       keras.metrics.TopKCategoricalAccuracy(k=1, name='acc_top1')
                    ])
model.fit(train, steps_per_epoch=train_size // batch_size,
          epochs=20, validation_data=valid, verbose=2)

...
Epoch 19/20
15/15 - 2s - loss: 0.2808 - acc_top3: 0.9979 - acc_top1: 0.9917 - 
val_loss: 1.5025 - val_acc_top3: 0.8147 - val_acc_top1: 0.6186

Epoch 20/20
15/15 - 2s - loss: 0.2743 - acc_top3: 0.9990 - acc_top1: 0.9885 - 
val_loss: 1.4948 - val_acc_top3: 0.8147 - val_acc_top1: 0.6255
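For readers unfamiliar with the acc_top1 and acc_top3 columns in the log above, the following is a small pure-Python sketch of what a top-k accuracy measures with one-hot targets. It is an illustration with invented data, not the Keras TopKCategoricalAccuracy implementation.

```python
def top_k_categorical_accuracy(one_hot_labels, preds, k):
    """Fraction of samples whose true class is among the k highest scores."""
    hits = 0
    for y, row in zip(one_hot_labels, preds):
        true_class = y.index(1)
        # class indices ordered from highest to lowest predicted score
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        hits += true_class in ranked[:k]
    return hits / len(preds)

one_hot_labels = [[0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 0, 0]]
preds = [                      # hypothetical probabilities for 4 classes
    [0.10, 0.20, 0.60, 0.10],  # true class ranked first
    [0.30, 0.40, 0.20, 0.10],  # true class ranked second
    [0.40, 0.05, 0.30, 0.25],  # true class ranked last
]

top1 = top_k_categorical_accuracy(one_hot_labels, preds, k=1)
top3 = top_k_categorical_accuracy(one_hot_labels, preds, k=3)
print(top1, top3)
```

Top-3 accuracy can never be lower than top-1 accuracy, which matches the pattern in the training log.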

Evaluation

# evaluate on test set 
model.evaluate(test, verbose=2)
97/97 - 18s - loss: 1.6482 - acc_top3: 0.7733 - acc_top1: 0.5994
[1.648208498954773, 0.7732964754104614, 0.5994470715522766]
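The 15/15 and 97/97 step counts in the logs follow directly from the split sizes and the batch size. As a quick arithmetic check, using the sizes quoted earlier in this answer:

```python
import math

train_size = 1020  # images in the 'train' split
test_size = 6149   # images in the 'test' split
batch_size = 64

# model.fit uses floor division because the training dataset repeats forever
steps_per_epoch = train_size // batch_size
# model.evaluate walks the whole test set, so the last partial batch counts too
test_batches = math.ceil(test_size / batch_size)
# images actually consumed in one training epoch
images_per_epoch = steps_per_epoch * batch_size

print(steps_per_epoch, test_batches, images_per_epoch)
```

Because of the floor division, each epoch consumes 960 of the 1020 training images; the pipeline repeats, so the leftover images are picked up in the following epoch.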
