TensorFlow/Keras Using specific class recall as metric for Sparse Categorical Cross Entropy


Problem Description


*Update at bottom

I am trying to use recall on 2 of 3 classes as a metric, so classes B and C out of classes A, B, C.

(The original motivation for this is that my model is highly imbalanced across the classes [~90% is class A], such that when I use accuracy I get results of ~90% just by predicting class A every time.)

model.compile(
              loss='sparse_categorical_crossentropy', #or categorical_crossentropy
              optimizer=opt,
              metrics=[tf.keras.metrics.Recall(class_id=1, name='recall_1'),tf.keras.metrics.Recall(class_id=2, name='recall_2')]
              )

history = model.fit(train_x, train_y, batch_size=BATCH, epochs=EPOCHS, validation_data=(validation_x, validation_y), callbacks=[tensorboard, checkpoint])

This spits out an error:

raise ValueError("Shapes %s and %s are incompatible" % (self, other))

ValueError: Shapes (None, 3) and (None, 1) are incompatible

Model summary is:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm (LSTM)                  (None, 120, 32)           19328
_________________________________________________________________
dropout (Dropout)            (None, 120, 32)           0
_________________________________________________________________
batch_normalization (BatchNo (None, 120, 32)           128
_________________________________________________________________
lstm_1 (LSTM)                (None, 120, 32)           8320
_________________________________________________________________
dropout_1 (Dropout)          (None, 120, 32)           0
_________________________________________________________________
batch_normalization_1 (Batch (None, 120, 32)           128
_________________________________________________________________
lstm_2 (LSTM)                (None, 32)                8320
_________________________________________________________________
dropout_2 (Dropout)          (None, 32)                0
_________________________________________________________________
batch_normalization_2 (Batch (None, 32)                128
_________________________________________________________________
dense (Dense)                (None, 32)                1056
_________________________________________________________________
dropout_3 (Dropout)          (None, 32)                0
_________________________________________________________________
dense_1 (Dense)              (None, 3)                 99
=================================================================
Total params: 37,507
Trainable params: 37,315
Non-trainable params: 192

Note that the model works fine without the errors if using:

metrics=['accuracy']

but this and this made me think that something along the lines of tf.metrics.SparseCategoricalRecall() has not been implemented, by analogy with the existing tf.metrics.SparseCategoricalAccuracy().


So I diverted to a custom metric, which descended into a rabbit hole of other issues, as I am highly illiterate when it comes to classes and decorators.

I botched this together from a custom metric example (I have no idea how to use sample_weight, so I commented it out to come back to later):

from sklearn.metrics import classification_report  # used by update_state below

class RelevantRecall(tf.keras.metrics.Metric):

    def __init__(self, name="Relevant_Recall", **kwargs):
        super(RelevantRecall, self).__init__(name=name, **kwargs)
        self.joined_recall = self.add_weight(name="B/C Recall", initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        y_pred = tf.argmax(y_pred, axis=1)
        report_dictionary = classification_report(y_true, y_pred, output_dict = True)

        # if sample_weight is not None:
        #     sample_weight = tf.cast(sample_weight, "float32")
        #     values = tf.multiply(values, sample_weight)
        # self.joined_recall.assign_add(tf.reduce_sum(values))

        self.joined_recall.assign_add((float(report_dictionary['1.0']['recall'])+float(report_dictionary['2.0']['recall']))/2)
 
    def result(self):
        return self.joined_recall

    def reset_states(self):
        # The state of the metric will be reset at the start of each epoch.
        self.joined_recall.assign(0.0)


model.compile(
              loss='sparse_categorical_crossentropy', #or categorical_crossentropy
              optimizer=opt,
              metrics=[RelevantRecall()]
              )


history = model.fit(train_x, train_y, batch_size=BATCH, epochs=EPOCHS, validation_data=(validation_x, validation_y), callbacks=[tensorboard, checkpoint])

The aim is for this to return a metric of [(recall(b) + recall(c)) / 2]. I'd imagine returning both recalls separately, like metrics=[recall(b), recall(c)], would be better, but I can't get the former to work anyway.

I got a tensor bool error: OperatorNotAllowedInGraphError: using a 'tf.Tensor' as a Python 'bool' is not allowed: AutoGraph did convert this function. This might indicate you are trying to use an unsupported feature. Googling that led me to add @tf.function above my custom metric class.

This led to an old-vs-new class type error:

super(RelevantRecall, self).__init__(name=name, **kwargs)
TypeError: super() argument 1 must be type, not Function

which I didn't see how I had caused, since the class does inherit from tf.keras.metrics.Metric?

As I said, I'm quite new to all aspects of this, so any help on how to achieve (and how best to achieve) a metric using only a selection of prediction classes would be really appreciated.

OR

if I am going about this entirely wrong let me know/guide me to the correct resource please

Ideally I'd like to go with the former method of using tf.keras.metrics.Recall(class_id=1..., as it seems the neatest way, if only it worked.

I am able to get the recall for each class when using a similar function in the callbacks part of the model, but this seems more intensive, as I have to run model.predict on the val/test data at the end of each epoch. It's also unclear whether this even tells the model to focus on improving the selected classes (i.e. the difference between implementing it as a metric vs. in a callback).


Callback code:

from tensorflow.keras.callbacks import Callback
from sklearn.metrics import classification_report

class MetricsCallback(Callback):
    def __init__(self, test_data, y_true):
        # Should be the label encoding of your classes
        self.y_true = y_true
        self.test_data = test_data

    def on_epoch_end(self, epoch, logs=None):
        # Here we get the probabilities - longer process
        y_pred = self.model.predict(self.test_data)

        # Here we get the actual classes
        y_pred = tf.argmax(y_pred,axis=1)
        report_dictionary = classification_report(self.y_true, y_pred, output_dict = True)
        print ("\n")
  
        print (f"Accuracy: {report_dictionary['accuracy']} - Holds: {report_dictionary['0.0']['recall']} - Sells: {report_dictionary['1.0']['recall']} - Buys: {report_dictionary['2.0']['recall']}")
        self._data = (float(report_dictionary['1.0']['recall'])+float(report_dictionary['2.0']['recall']))/2
        return

metrics_callback = MetricsCallback(test_data = validation_x, y_true = validation_y)

history = model.fit(train_x, train_y, batch_size=BATCH, epochs=EPOCHS, validation_data=(validation_x, validation_y), callbacks=[tensorboard, checkpoint, metrics_callback])


Update 19/07/2021

  • I have resorted to using categorical_crossentropy for loss instead of sparse_categorical_crossentropy.
  • One-hot-encoding my class/target arrays.
  • Using tf recall: tf.keras.metrics.Recall(class_id=1, name='recall_1')

I am now using the code below.

train_y = tf.one_hot(train_y, 3)
validation_y = tf.one_hot(validation_y, 3)
test_y = tf.one_hot(test_y, 3)

model.compile(
    loss='categorical_crossentropy',
    optimizer=opt,
    metrics=[tf.keras.metrics.Recall(class_id=1, name='No'),tf.keras.metrics.Recall(class_id=2, name='Yes')]
    ) #tf.keras.metrics.Recall(class_id=0, name='Wait')

history = model.fit(train_x, train_y, batch_size=BATCH, epochs=EPOCHS, validation_data=(validation_x, validation_y), callbacks=[tensorboard, checkpoint])

Thanks to Abhishek Prajapat

This achieves the same overall goal, and with only a small number of mutually exclusive classes the difference/impact on performance is probably very small.

However, in the case of a very large number of mutually exclusive classes, I still don't have a solution that achieves the same goal as above while using sparse_categorical_crossentropy.

Solution

Your problem is quite simple. I have put together an example for you:

import tensorflow as tf
from sklearn.datasets import make_classification

data = make_classification(n_samples=1000, n_features=20, n_classes=3, n_clusters_per_class=1)

model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(20)),
    tf.keras.layers.Dense(3, activation='softmax')
])

model.compile(
              loss=tf.keras.losses.CategoricalCrossentropy(), #or categorical_crossentropy
              optimizer='adam',
              metrics = [tf.keras.metrics.Recall(class_id=1)]
              )

y = tf.keras.utils.to_categorical(data[1], num_classes=3)

dataset = tf.data.Dataset.from_tensor_slices((data[0], y))
dataset = dataset.batch(10)

model.fit(dataset, epochs=10)

Now you can see that when you use metrics.Recall with a particular class_id, your targets y need to be one-hot encoded. So if we have 3 classes, then 0 should become -> [1, 0, 0], and likewise 1 -> [0, 1, 0] and 2 -> [0, 0, 1].
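For instance, a quick check of this encoding (a minimal illustration; tf.keras.utils.to_categorical used above behaves the same way for integer labels):

import tensorflow as tf

labels = tf.constant([0, 1, 2])
print(tf.one_hot(labels, depth=3).numpy())
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]]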

Without using extra memory

import tensorflow as tf
from sklearn.datasets import make_classification

data = make_classification(n_samples=1000, n_features=20, n_classes=3, n_clusters_per_class=1)

model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(20)),
    tf.keras.layers.Dense(3, activation='softmax')
])

model.compile(
              loss=tf.keras.losses.CategoricalCrossentropy(), #or categorical_crossentropy
              optimizer='adam',
              metrics = [tf.keras.metrics.Recall(class_id=1)]
              )

def encode(x, y):
    y = tf.one_hot(y, 3) # Here 3 is the number of classes
    return x, y

dataset = tf.data.Dataset.from_tensor_slices((data[0], data[1]))
dataset = dataset.map(encode)
dataset = dataset.batch(10)

model.fit(dataset, epochs=10)
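The only difference from the first example is that tf.one_hot now runs inside the tf.data pipeline, so each batch is encoded on the fly instead of materializing a full one-hot copy of the label array up front.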

New example -

import numpy as np
import tensorflow as tf
from sklearn.datasets import make_classification

data = make_classification(n_samples=1000, n_features=20, n_classes=3, n_clusters_per_class=1)

model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(20)),
    tf.keras.layers.Dense(3, activation='softmax')
])

def encode(x, y):
    y = tf.one_hot(y, 3)
    return x, y

dataset = tf.data.Dataset.from_tensor_slices((data[0], data[1]))
dataset = dataset.map(encode)
dataset = dataset.batch(10)

m1 = tf.keras.metrics.Recall()
m2 = tf.keras.metrics.Recall()

def my_recall(y_true, y_pred):

    # Column 1 is class B and column 2 is class C (y_true is one-hot here)
    actual_a = y_true[:, 1]
    pred_a = y_pred[:, 1]

    actual_b = y_true[:, 2]
    pred_b = y_pred[:, 2]

    m1.update_state(actual_a, pred_a)
    m2.update_state(actual_b, pred_b)

    return (m1.result() + m2.result()) / 2

model.compile(
              loss=tf.keras.losses.CategoricalCrossentropy(), #or categorical_crossentropy
              optimizer='adam',
              metrics = [my_recall]
              )

model.fit(dataset, epochs=10)
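One caveat: m1 and m2 are created once at module level, so their internal counts keep accumulating across epochs. A minimal sketch of one way to reset them per epoch, assuming per-epoch values are wanted (the ResetRecallStates name is my own):

class ResetRecallStates(tf.keras.callbacks.Callback):
    def on_epoch_begin(self, epoch, logs=None):
        # Clear the accumulated true/false positive counts each epoch
        m1.reset_states()
        m2.reset_states()

model.fit(dataset, epochs=10, callbacks=[ResetRecallStates()])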

For your updated question -

import numpy as np
import tensorflow as tf
from sklearn.datasets import make_classification

data = make_classification(n_samples=1000, n_features=20, n_classes=3, n_clusters_per_class=1)

model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(20)),
    tf.keras.layers.Dense(3, activation='softmax')
])

dataset = tf.data.Dataset.from_tensor_slices((data[0], data[1]))
dataset = dataset.batch(10)

m1 = tf.keras.metrics.Recall()
m2 = tf.keras.metrics.Recall()

def my_recall(y_true, y_pred):
    # y_true arrives as sparse integer labels, so one-hot encode it first
    y_true = tf.cast(y_true, dtype=tf.int32)
    y_true = tf.reshape(y_true, [-1])      # guard against a (batch, 1) label shape
    actual_onehot = tf.one_hot(y_true, 3)  # shape (batch, 3)
    # Slice out the columns for classes 1 and 2, as in the previous example
    actual_a = actual_onehot[:, 1]
    pred_a = y_pred[:, 1]
    actual_b = actual_onehot[:, 2]
    pred_b = y_pred[:, 2]
    m1.update_state(actual_a, pred_a)
    m2.update_state(actual_b, pred_b)
    return (m1.result() + m2.result()) / 2

model.compile(
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              optimizer='adam',          
              metrics = [my_recall]
              )

model.fit(dataset, epochs=10)
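As an optional sanity check, here's a minimal sketch comparing the averaged metric against sklearn's per-class recall on the model's final predictions. Note that tf.keras.metrics.Recall thresholds probabilities at 0.5 by default, so the numbers may not match an argmax-based recall exactly:

import numpy as np
from sklearn.metrics import recall_score

y_pred = np.argmax(model.predict(data[0]), axis=1)
per_class = recall_score(data[1], y_pred, average=None)  # recall for classes 0, 1, 2
print((per_class[1] + per_class[2]) / 2)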
