Keras Custom Binary Cross-Entropy Loss Function: NaN as Output for Loss


Question


I tried writing a custom binary cross-entropy loss function. This is my script:

from keras import backend as K

def my_custom_loss(y_true, y_pred):
    t_loss = -(y_true * K.log(y_pred) + (1 - y_true) * K.log(1 - y_pred))
    return K.mean(t_loss)

When I run my script using this loss function, after a few iterations, I get NaN as the output of the loss function.
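The NaN can be reproduced outside of Keras. When a sigmoid output saturates to exactly 0 or 1, `log()` returns `-inf`, and `0 * -inf` evaluates to NaN. A minimal NumPy sketch of the same formula (my own illustration, not the questioner's training setup):

```python
import numpy as np

# Perfectly confident predictions: the sigmoid output has saturated.
y_true = np.array([1.0, 0.0])
y_pred = np.array([1.0, 0.0])

with np.errstate(divide="ignore", invalid="ignore"):
    # Same expression as the custom loss above.
    t_loss = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(t_loss)  # [nan nan] — 0 * log(0) is 0 * -inf, which is NaN
```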

Then I looked at the TensorFlow documentation and modified the loss function into the following:

 t_loss = K.maximum(y_pred, 0) - y_pred * y_true + K.log(1 + K.exp(-K.abs(y_pred)))

The code runs without any issue. I would like to know if someone could explain why my first loss function gives a NaN output.

Binary cross-entropy: -(y * log(p) + (1 - y) * log(1 - p))

I use a sigmoid function as the activation for my last layer, so the value of 'p' should be between 0 and 1, and the log should exist for this range.

Thanks.

Answer

A naive implementation of binary cross-entropy runs into numerical problems when the output saturates to exactly 0 or 1, e.g. log(0) -> NaN. The formula you posted is reformulated to ensure stability and avoid overflow. The following derivation is from tf.nn.sigmoid_cross_entropy_with_logits.
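A common workaround for the naive probability-based form (and, to my understanding, what the Keras backend's own `binary_crossentropy` does internally) is to clip predictions away from 0 and 1 before taking the log. A minimal NumPy sketch, where `eps` is an assumed epsilon:

```python
import numpy as np

def clipped_bce(y_true, y_pred, eps=1e-7):
    # Clip predictions into [eps, 1 - eps] so log() never sees 0.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return np.mean(-(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)))

y_true = np.array([1.0, 0.0])
y_pred = np.array([1.0, 0.0])  # would produce NaN without the clipping
loss = clipped_bce(y_true, y_pred)
print(loss)  # finite and tiny (~1e-7) instead of NaN
```

Clipping trades a small bias near the boundaries for numerical safety; the logits-based reformulation below avoids even that.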

z * -log(sigmoid(x)) + (1 - z) * -log(1 - sigmoid(x))
= z * -log(1 / (1 + exp(-x))) + (1 - z) * -log(exp(-x) / (1 + exp(-x)))
= z * log(1 + exp(-x)) + (1 - z) * (-log(exp(-x)) + log(1 + exp(-x)))
= z * log(1 + exp(-x)) + (1 - z) * (x + log(1 + exp(-x)))
= (1 - z) * x + log(1 + exp(-x))
= x - x * z + log(1 + exp(-x))

For x < 0, to avoid overflow in exp(-x), we reformulate the above

x - x * z + log(1 + exp(-x))
= log(exp(x)) - x * z + log(1 + exp(-x))
= - x * z + log(1 + exp(x))

And the implementation uses the equivalent form:

max(x, 0) - x * z + log(1 + exp(-abs(x)))
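As a sanity check (my own NumPy sketch, not part of the original answer), the stable form agrees with the naive logits expression wherever the latter is finite, and stays finite for extreme logits where exp(-x) would overflow:

```python
import numpy as np

def stable_bce_logits(x, z):
    # max(x, 0) - x * z + log(1 + exp(-|x|)) — safe for large |x|
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

def naive_bce_logits(x, z):
    # Direct form: sigmoid first, then the cross-entropy on probabilities.
    p = 1.0 / (1.0 + np.exp(-x))
    return -(z * np.log(p) + (1 - z) * np.log(1 - p))

x = np.array([-5.0, -0.5, 0.0, 0.5, 5.0])  # logits (pre-sigmoid outputs)
z = np.array([0.0, 1.0, 1.0, 0.0, 1.0])    # labels
assert np.allclose(stable_bce_logits(x, z), naive_bce_logits(x, z))

# The stable form survives logits where exp(-x) or log(1 - p) would blow up:
extreme = stable_bce_logits(np.array([1000.0]), np.array([0.0]))
print(extreme)  # [1000.]
```

Note that both forms take raw logits `x`, not sigmoid outputs; this is why `tf.nn.sigmoid_cross_entropy_with_logits` expects the pre-activation values.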
