Understanding WeightedKappaLoss using Keras
I'm using Keras to try to predict a vector of scores (0-1) using a sequence of events.
For example, X is a sequence of 3 vectors comprised of 6 features each, while y is a vector of 3 scores:
X
[
[1,2,3,4,5,6], <--- dummy data
[1,2,3,4,5,6],
[1,2,3,4,5,6]
]
y
[0.34 ,0.12 ,0.46] <--- dummy data
I want to address the problem as ordinal classification, so if the actual values are [0.5,0.5,0.5], the prediction [0.49,0.49,0.49] is better than [0.3,0.3,0.3]. My original solution was to use sigmoid activation on my last layer and mse as the loss function, so the output ranges between 0 and 1 for each of the output neurons:
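from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam

# `config` is a dict of hyperparameters defined elsewhere
def get_model(num_samples, num_features, output_size):
    opt = Adam()
    model = Sequential()
    model.add(LSTM(config['lstm_neurons'], activation=config['lstm_activation'], input_shape=(num_samples, num_features)))
    model.add(Dropout(config['dropout_rate']))
    for layer in config['dense_layers']:
        model.add(Dense(layer['neurons'], activation=layer['activation']))
    # One sigmoid unit per score, so every output lies in (0, 1)
    model.add(Dense(output_size, activation='sigmoid'))
    model.compile(loss='mse', optimizer=opt, metrics=['mae', 'mse'])
    return model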
My goal is to understand the usage of WeightedKappaLoss and to apply it to my actual data. I've created this Colab to fiddle around with the idea. In the Colab, my data is a sequence shaped (5000,3,3) and my targets are shaped (5000, 4), representing 1 of 4 possible classes.
I want the model to understand that it needs to truncate the fractional part of the values in X in order to predict the right y class:
[[3.49877793, 3.65873511, 3.20218196],
[3.20258153, 3.7578669 , 3.83365481],
[3.9579924 , 3.41765455, 3.89652426]], ----> y is 3 [0,0,1,0]
[[1.74290875, 1.41573056, 1.31195701],
[1.89952004, 1.95459796, 1.93148095],
[1.18668981, 1.98982041, 1.89025326]], ----> y is 1 [1,0,0,0]
New model code:
import tensorflow_addons as tfa

def get_model(num_samples, num_features, output_size):
    opt = Adam(learning_rate=config['learning_rate'])
    model = Sequential()
    model.add(LSTM(config['lstm_neurons'], activation=config['lstm_activation'], input_shape=(num_samples, num_features)))
    model.add(Dropout(config['dropout_rate']))
    for layer in config['dense_layers']:
        model.add(Dense(layer['neurons'], activation=layer['activation']))
    # Softmax over the ordered classes, as expected by WeightedKappaLoss
    model.add(Dense(output_size, activation='softmax'))
    model.compile(loss=tfa.losses.WeightedKappaLoss(num_classes=4), optimizer=opt, metrics=[tfa.metrics.CohenKappa(num_classes=4)])
    return model
When fitting the model I can see the following metrics on TensorBoard:
I'm not sure about the following points and would appreciate clarification:
- Am I using it right?
- In my original problem, I'm predicting 3 scores, as opposed to the Colab example, where I'm predicting only 1. If I'm using WeightedKappaLoss, does it mean I'll need to convert each of the scores to a one-hot encoded vector of length 100?
- Is there a way to use the WeightedKappaLoss on the original floating point scores without converting to a classification problem?
Let's split the goal into two sub-goals: first we walk through the purpose, concept, and mathematical details of Weighted Kappa, and after that we summarize the things to note when using WeightedKappaLoss in TensorFlow.
PS: you can skip the explanation part if you only care about usage.
Weighted Kappa detailed explanation
Since Weighted Kappa can be seen as Cohen's kappa + weights, we need to understand Cohen's kappa first.
Example of Cohen's kappa
Suppose we have two classifiers (A and B) trying to classify 50 statements into two categories (True and False). The way they classify those statements with respect to each other is shown in a contingency table:
                  B
              True   False
A    True      20      5      25 statements A thinks are true
     False     10     15      25 statements A thinks are false
               30 statements B thinks are true
                      20 statements B thinks are false
Now suppose we want to know: how reliable are the predictions that A and B made?
What we can do is simply take the percentage of classified statements on which A and B agree with each other, i.e. the proportion of observed agreement, denoted Po, so:
Po = (20 + 15) / 50 = 0.7
But this is problematic, because there is some probability that A and B agree with each other by random chance, i.e. the proportion of expected chance agreement, denoted Pe. If we use the observed percentages as the expected probabilities, then:
Pe = (probability that A thinks a statement is true) * (probability that B thinks a statement is true) +
     (probability that A thinks a statement is false) * (probability that B thinks a statement is false)
   = (25 / 50) * (30 / 50) +
     (25 / 50) * (20 / 50)
   = 0.5
Cohen's kappa coefficient, denoted K, incorporates Po and Pe to give us a more robust estimate of the reliability of the predictions that A and B made:
K = (Po - Pe) / (1 - Pe) = 1 - (1 - Po) / (1 - Pe) = 1 - (1 - 0.7) / (1 - 0.5) = 0.4
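The arithmetic above can be reproduced with a few lines of plain Python. This is only a sketch for checking the numbers: the helper name cohen_kappa is made up for illustration, and the table is the 2x2 example from this answer (rows are A, columns are B).

```python
def cohen_kappa(table):
    """Return (Po, Pe, K) for a square contingency table of counts."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: share of items on the diagonal.
    po = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the marginal proportions, summed per class.
    pe = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return po, pe, (po - pe) / (1 - pe)


# Rows: A says True / False. Columns: B says True / False.
table = [
    [20, 5],
    [10, 15],
]
po, pe, kappa = cohen_kappa(table)
print(f"Po={po:.2f}, Pe={pe:.2f}, K={kappa:.2f}")
```

With the example table this gives Po = 0.7, Pe = 0.5 and K of about 0.4, matching the hand calculation.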
We can see that the more A and B agree with each other (higher Po) and the less they agree because of chance (lower Pe), the more Cohen's kappa "thinks" the result is reliable.
Now assume A is the labels (ground truth) of the statements. Then K tells us how reliable B's predictions are, i.e. how much the predictions agree with the labels when random chance is taken into consideration.
Weights for Cohen's kappa
We define the contingency table with k classes formally:
                                    classifier 2
                        class.1  class.2  class...  class.k  Sum over row
              class.1     n11      n12      ...       n1k        n1+
              class.2     n21      n22      ...       n2k        n2+
classifier 1  class...    ...      ...      ...       ...        ...
              class.k     nk1      nk2      ...       nkk        nk+
      Sum over column     n+1      n+2      ...       n+k        N    # total sum of all table cells
The table cells contain the counts of cross-classified categories, denoted nij, where i and j are the row and column index respectively.
Suppose those k ordinal classes are obtained by subdividing two categorical classes, e.g. splitting 1, 0 into the five classes 1, 0.75, 0.5, 0.25, 0, which form a smooth ordered transition. We can no longer say the classes are independent, except for the first and the last class. For example, with very good, good, normal, bad, very bad, the classes very good and good are not independent, and good should be closer to bad than to very bad.
Since adjacent classes are interdependent, in order to calculate the quantities related to agreement we need to define this dependency, i.e. the weights, denoted Wij. A weight is assigned to each cell in the contingency table, and its value (within the range [0, 1]) depends on how close the two classes are.
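Now let's look at the Po and Pe formulas in Weighted Kappa, where pij = nij / N is the proportion of observations in cell (i, j), and pi+ = ni+ / N and p+j = n+j / N are the row and column marginal proportions:
$$P_o = \sum_{i=1}^{k} \sum_{j=1}^{k} W_{ij} \, p_{ij}, \qquad P_e = \sum_{i=1}^{k} \sum_{j=1}^{k} W_{ij} \, p_{i+} \, p_{+j}$$
And the Po and Pe formulas in Cohen's kappa:
$$P_o = \sum_{i=1}^{k} p_{ii}, \qquad P_e = \sum_{i=1}^{k} p_{i+} \, p_{+i}$$
We can see that the Po and Pe formulas in Cohen's kappa are a special case of the formulas in Weighted Kappa, where weight = 1 is assigned to all diagonal cells and weight = 0 elsewhere. When we calculate K (Cohen's kappa coefficient) using the Po and Pe formulas of Weighted Kappa, we also take the dependency between adjacent classes into consideration.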
Here are two commonly used weighting systems:
- Linear weight: $W_{ij} = 1 - \frac{|i-j|}{k-1}$
- Quadratic weight: $W_{ij} = 1 - \left(\frac{|i-j|}{k-1}\right)^2$
where |i-j| is the distance between classes and k is the number of classes.
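To make the weighting concrete, here is a small plain-Python sketch. It assumes the common definitions Wij = 1 - |i-j| / (k-1) for linear weights and Wij = 1 - (|i-j| / (k-1))^2 for quadratic weights. The function names and the 3-class table are made up for illustration and are not part of any library.

```python
def weight_matrix(k, weightage="linear"):
    """Agreement weights Wij in [0, 1]: 1 on the diagonal, smaller further away."""
    power = 1 if weightage == "linear" else 2
    return [[1 - (abs(i - j) / (k - 1)) ** power for j in range(k)]
            for i in range(k)]


def weighted_kappa(table, weightage="linear"):
    """Weighted kappa for a square contingency table of counts."""
    k = len(table)
    n = sum(sum(row) for row in table)
    w = weight_matrix(k, weightage)
    row_p = [sum(table[i]) / n for i in range(k)]
    col_p = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Weighted observed and chance agreement.
    po = sum(w[i][j] * table[i][j] / n for i in range(k) for j in range(k))
    pe = sum(w[i][j] * row_p[i] * col_p[j] for i in range(k) for j in range(k))
    return (po - pe) / (1 - pe)


# Made-up 3-class table in which every disagreement is between adjacent classes.
near_misses = [
    [10, 5, 0],
    [5, 10, 5],
    [0, 5, 10],
]
print(weight_matrix(3, "linear"))
print(weighted_kappa(near_misses, "linear"))
# With only two classes the weights are 1 on the diagonal and 0 elsewhere,
# so the result reduces to plain Cohen's kappa for the earlier 2x2 example.
print(weighted_kappa([[20, 5], [10, 15]], "linear"))
```

For the made-up table, the unweighted Cohen's kappa is roughly 0.39, while the linearly weighted value is roughly 0.52, because near misses receive partial credit.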
Weighted Kappa Loss
This loss is used in the case mentioned before, where one classifier is the labels. The purpose of this loss is to make the predictions of the model (the other classifier) as reliable as possible, i.e. to encourage the model to make more predictions that agree with the labels and fewer random guesses, while taking the dependency between adjacent classes into consideration.
The formula of Weighted Kappa Loss is given by:
$$L = \log\left(\frac{\sum_{i=1}^{k}\sum_{j=1}^{k} d_{ij}\, p_{ij}}{\sum_{i=1}^{k}\sum_{j=1}^{k} d_{ij}\, p_{i+}\, p_{+j}}\right)$$
It just takes the formula of the negative Cohen's kappa coefficient, -K = (1 - Po) / (1 - Pe) - 1, gets rid of the constant -1, and then applies the natural logarithm to it, where dij = |i-j| for Linear weight and dij = (|i-j|)^2 for Quadratic weight.
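The step from the agreement weights Wij to the distances dij can be spelled out as follows. This is a sketch under the assumption that Wij = 1 - dij / c, with c = k - 1 for linear and c = (k - 1)^2 for quadratic weights. It uses the fact that the proportions pij, and likewise the products pi+ p+j, sum to 1 over all cells:

```latex
\begin{aligned}
1 - P_o &= \sum_{i,j} p_{ij} - \sum_{i,j} W_{ij}\, p_{ij}
         = \sum_{i,j} (1 - W_{ij})\, p_{ij}
         = \frac{1}{c} \sum_{i,j} d_{ij}\, p_{ij} \\
1 - P_e &= \sum_{i,j} (1 - W_{ij})\, p_{i+}\, p_{+j}
         = \frac{1}{c} \sum_{i,j} d_{ij}\, p_{i+}\, p_{+j} \\
1 - K   &= \frac{1 - P_o}{1 - P_e}
         = \frac{\sum_{i,j} d_{ij}\, p_{ij}}{\sum_{i,j} d_{ij}\, p_{i+}\, p_{+j}}
\end{aligned}
```

The constant c cancels in the ratio, which is why the loss can work with the raw distances dij, and minimizing log(1 - K) is the same as maximizing K.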
The following is the source code of Weighted Kappa Loss written with TensorFlow. As you can see, it just implements the formula of Weighted Kappa Loss above:
import warnings
from typing import Optional

import tensorflow as tf
from typeguard import typechecked

from tensorflow_addons.utils.types import Number


class WeightedKappaLoss(tf.keras.losses.Loss):
    @typechecked
    def __init__(
        self,
        num_classes: int,
        weightage: Optional[str] = "quadratic",
        name: Optional[str] = "cohen_kappa_loss",
        epsilon: Optional[Number] = 1e-6,
        dtype: Optional[tf.DType] = tf.float32,
        reduction: str = tf.keras.losses.Reduction.NONE,
    ):
        super().__init__(name=name, reduction=reduction)

        warnings.warn(
            "The data type for `WeightedKappaLoss` defaults to "
            "`tf.keras.backend.floatx()`."
            "The argument `dtype` will be removed in Addons `0.12`.",
            DeprecationWarning,
        )

        if weightage not in ("linear", "quadratic"):
            raise ValueError("Unknown kappa weighting type.")

        self.weightage = weightage
        self.num_classes = num_classes
        self.epsilon = epsilon or tf.keras.backend.epsilon()
        label_vec = tf.range(num_classes, dtype=tf.keras.backend.floatx())
        self.row_label_vec = tf.reshape(label_vec, [1, num_classes])
        self.col_label_vec = tf.reshape(label_vec, [num_classes, 1])
        col_mat = tf.tile(self.col_label_vec, [1, num_classes])
        row_mat = tf.tile(self.row_label_vec, [num_classes, 1])
        if weightage == "linear":
            self.weight_mat = tf.abs(col_mat - row_mat)
        else:
            self.weight_mat = (col_mat - row_mat) ** 2

    def call(self, y_true, y_pred):
        y_true = tf.cast(y_true, dtype=self.col_label_vec.dtype)
        y_pred = tf.cast(y_pred, dtype=self.weight_mat.dtype)
        batch_size = tf.shape(y_true)[0]
        cat_labels = tf.matmul(y_true, self.col_label_vec)
        cat_label_mat = tf.tile(cat_labels, [1, self.num_classes])
        row_label_mat = tf.tile(self.row_label_vec, [batch_size, 1])
        if self.weightage == "linear":
            weight = tf.abs(cat_label_mat - row_label_mat)
        else:
            weight = (cat_label_mat - row_label_mat) ** 2
        numerator = tf.reduce_sum(weight * y_pred)
        label_dist = tf.reduce_sum(y_true, axis=0, keepdims=True)
        pred_dist = tf.reduce_sum(y_pred, axis=0, keepdims=True)
        w_pred_dist = tf.matmul(self.weight_mat, pred_dist, transpose_b=True)
        denominator = tf.reduce_sum(tf.matmul(label_dist, w_pred_dist))
        denominator /= tf.cast(batch_size, dtype=denominator.dtype)
        loss = tf.math.divide_no_nan(numerator, denominator)
        return tf.math.log(loss + self.epsilon)

    def get_config(self):
        config = {
            "num_classes": self.num_classes,
            "weightage": self.weightage,
            "epsilon": self.epsilon,
        }
        base_config = super().get_config()
        return {**base_config, **config}
Usage of Weighted Kappa Loss
We can use Weighted Kappa Loss whenever we can form our problem as an ordinal classification problem, i.e. the classes form a smooth ordered transition and adjacent classes are interdependent, like ranking something as very good, good, normal, bad, very bad. The output of the model should look like Softmax results.
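If the scores do have to be cast as ordinal classes, one possible approach (an assumption for illustration, not something the library prescribes) is to cut the [0, 1] range into equal-width bins and one-hot encode the bin index. The helper name score_to_one_hot and the choice of 4 bins below are hypothetical:

```python
def score_to_one_hot(score, num_classes):
    """Map a score in [0, 1] to a one-hot vector over ordered, equal-width bins."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    # A score of exactly 1.0 is placed in the last bin.
    index = min(int(score * num_classes), num_classes - 1)
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


# The dummy scores from the question, binned into 4 ordered classes.
for score in (0.34, 0.12, 0.46):
    print(score, score_to_one_hot(score, 4))
```

The number of bins is a trade-off: more bins keep more of the original resolution but leave fewer examples per class, so 100 classes per score is not a requirement of the loss itself.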
We cannot use Weighted Kappa Loss when we try to predict a vector of scores (0-1), even if they sum to 1, since the weight applied to each element of the vector is different. This loss does not ask how far the predicted values are from the true values by subtraction. Instead, it asks how much predicted probability falls on each class, multiplied by that class's weight, e.g.:
import tensorflow as tf
from tensorflow_addons.losses import WeightedKappaLoss

y_true = tf.constant([[0.1, 0.2, 0.6, 0.1], [0.1, 0.5, 0.3, 0.1],
                      [0.8, 0.05, 0.05, 0.1], [0.01, 0.09, 0.1, 0.8]])
y_pred_0 = tf.constant([[0.1, 0.2, 0.6, 0.1], [0.1, 0.5, 0.3, 0.1],
                        [0.8, 0.05, 0.05, 0.1], [0.01, 0.09, 0.1, 0.8]])
y_pred_1 = tf.constant([[0.0, 0.1, 0.9, 0.0], [0.1, 0.5, 0.3, 0.1],
                        [0.8, 0.05, 0.05, 0.1], [0.01, 0.09, 0.1, 0.8]])

kappa_loss = WeightedKappaLoss(weightage='linear', num_classes=4)
loss_0 = kappa_loss(y_true, y_pred_0)
loss_1 = kappa_loss(y_true, y_pred_1)
print('Loss_0: {}, loss_1: {}'.format(loss_0.numpy(), loss_1.numpy()))
Outputs:
# y_pred_0 equal to y_true yet loss_1 is smaller than loss_0
Loss_0: -0.7053321599960327, loss_1: -0.8015820980072021
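To see where these numbers come from, the same computation can be mirrored in plain Python without TensorFlow. This is a sketch that follows the steps of the call method in the source code shown earlier. The function name weighted_kappa_loss is made up here, and unlike divide_no_nan it does not guard against a zero denominator.

```python
import math


def weighted_kappa_loss(y_true, y_pred, weightage="linear", epsilon=1e-6):
    """Plain-Python mirror of the loss: log(numerator / denominator + epsilon)."""
    batch_size = len(y_true)
    k = len(y_true[0])
    power = 1 if weightage == "linear" else 2
    # "Soft" class index of each label row: sum_j j * y_true[b][j].
    cat_labels = [sum(j * row[j] for j in range(k)) for row in y_true]
    # Predicted mass on class j, weighted by its distance to the label index.
    numerator = sum(
        abs(cat_labels[b] - j) ** power * y_pred[b][j]
        for b in range(batch_size) for j in range(k)
    )
    label_dist = [sum(row[i] for row in y_true) for i in range(k)]
    pred_dist = [sum(row[j] for row in y_pred) for j in range(k)]
    denominator = sum(
        label_dist[i] * abs(i - j) ** power * pred_dist[j]
        for i in range(k) for j in range(k)
    ) / batch_size
    return math.log(numerator / denominator + epsilon)


y_true = [[0.1, 0.2, 0.6, 0.1], [0.1, 0.5, 0.3, 0.1],
          [0.8, 0.05, 0.05, 0.1], [0.01, 0.09, 0.1, 0.8]]
y_pred_0 = [row[:] for row in y_true]  # identical to y_true
y_pred_1 = [[0.0, 0.1, 0.9, 0.0]] + [row[:] for row in y_true[1:]]

loss_0 = weighted_kappa_loss(y_true, y_pred_0)
loss_1 = weighted_kappa_loss(y_true, y_pred_1)
print(loss_0, loss_1)
```

The first label row has a soft class index of 0.2 + 2 * 0.6 + 3 * 0.1 = 1.7, so the loss rewards putting predicted mass near class 2. The prediction [0.0, 0.1, 0.9, 0.0] does this more strongly than the label vector itself, which is why loss_1 comes out lower than loss_0.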
Your code in Colab works correctly in the context of ordinal classification problems. Since the function X->Y you formed is very simple (the integer part of X is the Y index + 1), the model learns it fairly quickly and accurately: as we can see, K (Cohen's kappa coefficient) goes up to 1.0 and Weighted Kappa Loss drops below -13.0. In practice this is about the minimum we can expect, because with the default epsilon of 1e-6 the loss cannot go below log(1e-6), which is roughly -13.8.
In summary, you can use Weighted Kappa Loss only if you can form your problem as an ordinal classification problem with labels in one-hot fashion. If you can, and you are trying to solve an LTR (learning to rank) problem, then you can check this tutorial on implementing ListNet and this tutorial of tensorflow_ranking for better results. Otherwise you shouldn't use Weighted Kappa Loss: if you can only form your problem as a regression problem, you should do the same as in your original solution.
Reference:
Weighted Kappa in R: For Two Ordinal Variables
Source code of WeightedKappaLoss in tensorflow-addons
Documentation of tfa.losses.WeightedKappaLoss
Difference between categorical, ordinal and numerical variables