How to apply weights to a sigmoid cross entropy loss function in Tensorflow?


Problem Description


The training dataset contains two classes A and B, which we represent as 1 and 0 in our target labels respectively. Our label data is heavily skewed towards class 0, which takes roughly 95% of the data, while class 1 is only 5%. How should we construct our loss function in such a case?

I found Tensorflow has a function that can be used with weights:

tf.losses.sigmoid_cross_entropy

weights acts as a coefficient for the loss. If a scalar is provided, then the loss is simply scaled by the given value.

Sounds good. I set weights to 2.0 to make the loss higher and punish errors more.

loss = loss_fn(targets, cell_outputs, weights=2.0, label_smoothing=0)

However, not only did the loss not go down, it increased, and the final accuracy on the dataset decreased slightly. Ok, maybe I misunderstood and it should be < 1.0, so I tried a smaller number. This didn't change anything, I got almost the same loss and accuracy. O_o

Needless to say, the same network trained on the same dataset but with a loss weight of 0.3 reduces the loss by up to 10x in Torch / PyTorch.

Can somebody please explain how to use loss weights in Tensorflow?

Solution

If you're scaling the loss with a scalar, like 2.0, then basically you're multiplying the loss and therefore the gradient for backpropagation. It's similar to increasing the learning rate, but not exactly the same, because you're also changing the ratio to regularization losses such as weight decay.

If your classes are heavily skewed and you want to balance them when computing the loss, then you have to specify a tensor as the weight, as described in the manual for tf.losses.sigmoid_cross_entropy():

weights: Optional Tensor whose rank is either 0, or the same rank as labels, and must be broadcastable to labels (i.e., all dimensions must be either 1, or the same as the corresponding losses dimension).

That is, make the weights tensor 1.0 for class 0 and maybe 10.0 for class 1, and now "false negative" losses will be counted much more heavily.
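
For illustration, here is a minimal sketch of one way to build such a per-example weight tensor from the labels and pass it to tf.losses.sigmoid_cross_entropy() (the placeholder names, shapes, and the 10.0 weight are assumptions, not taken from the original question):

import tensorflow as tf

# Assumed shapes: a batch of binary targets (0.0 / 1.0) and raw logits.
targets = tf.placeholder(tf.float32, shape=[None, 1])
cell_outputs = tf.placeholder(tf.float32, shape=[None, 1])

# Weight 1.0 for the majority class 0 and 10.0 for the rare class 1
# (10.0 is an arbitrary starting point; tune it on validation data).
weights = tf.where(tf.equal(targets, 1.0),
                   10.0 * tf.ones_like(targets),
                   tf.ones_like(targets))

# The weights tensor is applied element-wise to the per-example losses.
loss = tf.losses.sigmoid_cross_entropy(
    multi_class_labels=targets,
    logits=cell_outputs,
    weights=weights)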

How much you should over-weight the underrepresented class is something of an art. If you overdo it, the model will collapse and predict the over-weighted class all the time.

An alternative way to achieve the same thing is to use tf.nn.weighted_cross_entropy_with_logits(), which has a pos_weight argument for exactly this purpose. But it's in tf.nn, not tf.losses, so you have to add it to the losses collection manually.
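
A hedged sketch of that alternative, reusing the same assumed targets / cell_outputs names and an arbitrary pos_weight of 10.0:

import tensorflow as tf

# Assumed placeholders for binary targets and raw logits.
targets = tf.placeholder(tf.float32, shape=[None, 1])
cell_outputs = tf.placeholder(tf.float32, shape=[None, 1])

# pos_weight > 1.0 up-weights errors on the rare positive class.
per_example_loss = tf.nn.weighted_cross_entropy_with_logits(
    targets, cell_outputs, pos_weight=10.0)
loss = tf.reduce_mean(per_example_loss)

# tf.nn ops are not registered in the losses collection automatically,
# so add the loss by hand if the rest of the code relies on that collection.
tf.losses.add_loss(loss)
total_loss = tf.losses.get_total_loss()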

Generally, another way to handle this is to arbitrarily increase the proportion of the underrepresented class at sampling time. That should not be overdone either, however. You can also combine both approaches.
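
As a rough sketch of the sampling approach, recent TensorFlow releases provide tf.data.experimental.sample_from_datasets() (older 1.x versions had it under tf.contrib.data); the toy arrays and the 50/50 split below are made up for illustration only:

import numpy as np
import tensorflow as tf

# Toy data, roughly 95% class 0 and 5% class 1 (shapes are made up).
features = np.random.randn(1000, 4).astype(np.float32)
labels = (np.random.rand(1000) < 0.05).astype(np.float32)

ds = tf.data.Dataset.from_tensor_slices((features, labels))
ds_neg = ds.filter(lambda x, y: tf.equal(y, 0.0)).repeat()
ds_pos = ds.filter(lambda x, y: tf.equal(y, 1.0)).repeat()

# Draw the two classes with equal probability instead of the natural 95/5;
# as noted above, don't overdo the rebalancing either.
balanced = tf.data.experimental.sample_from_datasets(
    [ds_neg, ds_pos], weights=[0.5, 0.5]).batch(32)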

