Dice loss becomes NAN after some epochs

Problem description

I am working on an image-segmentation application where the loss function is Dice loss. The issue is that the loss function becomes NaN after some epochs. I am doing 5-fold cross-validation and checking the validation and training losses for each fold. For some folds, the loss quickly becomes NaN, and for others it takes a while to reach NaN. I have inserted a constant into the loss-function formulation to avoid over/under-flow, but the same problem still occurs. My inputs are scaled to the range [-1, 1]. I have seen people suggest using regularizers and different optimizers, but I don't understand why the loss becomes NaN in the first place. I have pasted the loss function, and the training and validation losses for some epochs, below. Initially only the validation loss and the validation dice score become NaN, but later all metrics become NaN.

import tensorflow as tf

def dice_loss(y_true, y_pred):  # y_true --> ground truth, y_pred --> predictions
    smooth = 1.  # smoothing constant to avoid division by zero / over- and under-flow
    y_true_f = tf.keras.backend.flatten(y_true)
    y_pred_f = tf.keras.backend.flatten(y_pred)
    intersection = tf.keras.backend.sum(y_true_f * y_pred_f)
    return 1 - (2. * intersection + smooth) / (tf.keras.backend.sum(y_true_f) +
                                               tf.keras.backend.sum(y_pred_f) + smooth)
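
For context, here is a minimal sketch of how such a custom loss is typically wired into a tf.keras model; the tiny placeholder network, input shape, optimizer and learning rate below are assumptions for illustration, not details from the original question.

import tensorflow as tf

# Placeholder segmentation model, only to show how dice_loss is plugged in.
inputs = tf.keras.Input(shape=(128, 128, 1))
x = tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu')(inputs)
outputs = tf.keras.layers.Conv2D(1, 1, activation='sigmoid')(x)
model = tf.keras.Model(inputs, outputs)

# dice_loss serves both as the training objective and as a logged metric.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss=dice_loss,
              metrics=[dice_loss])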

epoch   train_dice_score      train_loss    val_dice_score  val_loss
0       0.42387727            0.423877264   0.35388064      0.353880603
1       0.23064087            0.230640889   0.21502239      0.215022382
2       0.17881058            0.178810576   0.1767999       0.176799848
3       0.15746565            0.157465705   0.16138957      0.161389555
4       0.13828343            0.138283484   0.12770002      0.127699989
5       0.10434002            0.104340041   0.0981831       0.098183098
6       0.08013707            0.080137035   0.08188484      0.081884826
7       0.07081806            0.070818066   0.070421465     0.070421467
8       0.058371827           0.058371854   0.060712796     0.060712777
9       0.06381426            0.063814262   nan             nan
10      0.105625264           0.105625251   nan             nan
11      0.10790708            0.107907102   nan             nan
12      0.10719114            0.10719115    nan             nan


Recommended answer

I was getting the same problem with my segmentation model too. I ran into it when I used both Dice loss and weighted cross-entropy loss. I found a solution, in case somebody still has the same problem.

I was focusing on my custom loss, but then I figured out that the NaN values came from inside the model at computation time. Because of relu, the inner values become too high and then turn into NaN.
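
As an aside, one way to confirm that the NaN really originates inside the model rather than in the loss (a debugging sketch assuming a standard tf.keras training loop; model and the data arrays are placeholders, not from the original answer) is to enable numeric checking and stop training on the first NaN loss:

import tensorflow as tf

# Report the first op that produces NaN/Inf anywhere in the computation.
tf.debugging.enable_check_numerics()

# Stop training as soon as the loss itself becomes NaN.
model.fit(x_train, y_train,                      # placeholder model and data
          validation_data=(x_val, y_val),
          epochs=50,
          callbacks=[tf.keras.callbacks.TerminateOnNaN()])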

To solve this, I used batch normalization after every convolution with relu, and it worked for me.
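
A minimal sketch of that pattern, assuming a plain tf.keras convolutional block (filter counts and kernel sizes are placeholders): batch normalization is added after each relu convolution so the activations cannot grow without bound.

import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    # Convolution with relu, followed by batch normalization, as described above.
    x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    x = layers.BatchNormalization()(x)
    return x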
