Should the custom loss function in Keras return a single loss value for the batch or an array of losses for every sample in the training batch?


Problem Description


I'm learning the Keras API in TensorFlow (2.3). In this guide on the TensorFlow website, I found an example of a custom loss function:

    # Reduces over every axis, so it returns a single scalar for the whole batch
    def custom_mean_squared_error(y_true, y_pred):
        return tf.math.reduce_mean(tf.square(y_true - y_pred))


The reduce_mean function in this custom loss function returns a scalar.


Is it right to define a loss function like this? As far as I know, the first dimension of the shapes of y_true and y_pred is the batch size. I think the loss function should return loss values for every sample in the batch, i.e. an array of shape (batch_size,). But the above function gives a single value for the whole batch.
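To make the shape difference concrete, here is a small sketch (the tensor values are made up for illustration) comparing the built-in mean_squared_error, which reduces only over the last axis, with the guide's custom version, which reduces over everything:

```python
import tensorflow as tf

y_true = tf.constant([[0.0, 1.0], [1.0, 1.0]])  # batch of 2 samples
y_pred = tf.constant([[1.0, 1.0], [1.0, 0.0]])

# The built-in loss reduces only over the last (feature) axis,
# so it returns one loss value per sample: shape (batch_size,)
per_sample = tf.keras.losses.mean_squared_error(y_true, y_pred)
print(per_sample.shape)  # (2,)

# The guide's custom loss reduces over every axis,
# so it returns a single scalar for the whole batch: shape ()
batch_scalar = tf.math.reduce_mean(tf.square(y_true - y_pred))
print(batch_scalar.shape)  # ()
```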


Maybe the above example is wrong? Could anyone give me some help on this problem?

P.S. Why do I think the loss function should return an array instead of a single value?


I read the source code of the Model class. When you provide a loss function (please note it's a function, not a loss class) to the Model.compile() method, the loss function is used to construct a LossesContainer object, which is stored in Model.compiled_loss. This loss function passed to the constructor of the LossesContainer class is used once again to construct a LossFunctionWrapper object, which is stored in LossesContainer._losses.


According to the source code of the LossFunctionWrapper class, the overall loss value for a training batch is calculated by the LossFunctionWrapper.__call__() method (inherited from the Loss class), i.e. it returns a single loss value for the whole batch. But LossFunctionWrapper.__call__() first calls the LossFunctionWrapper.call() method to obtain an array of losses, one per sample in the training batch. These losses are then finally averaged to get the single loss value for the whole batch. It's in the LossFunctionWrapper.call() method that the loss function provided to the Model.compile() method is called.
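This two-step behaviour can be checked directly with the built-in MSE, which goes through the same wrapper code path (a minimal sketch):

```python
import tensorflow as tf

y_true = tf.constant([[0.0, 2.0], [2.0, 0.0]])
y_pred = tf.zeros_like(y_true)

# The bare function (what LossFunctionWrapper.call() invokes)
# keeps one loss value per sample
per_sample = tf.keras.losses.mean_squared_error(y_true, y_pred)

# The Loss object's __call__() applies the reduction on top and
# returns a single scalar for the batch
batch_loss = tf.keras.losses.MeanSquaredError()(y_true, y_pred)

# batch_loss equals the mean of the per-sample losses
```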


That's why I think the custom loss function should return an array of losses, instead of a single scalar value. Besides, if we write a custom Loss class for the Model.compile() method, the call() method of our custom Loss class should also return an array, rather than a single value.
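For example, a custom Loss subclass along these lines (a sketch mirroring how the built-in losses are written) would return per-sample values from call() and let the inherited __call__() do the reduction:

```python
import tensorflow as tf

class CustomMeanSquaredError(tf.keras.losses.Loss):
    # call() returns one loss value per sample (shape (batch_size,));
    # the inherited __call__() reduces them to a single scalar.
    def call(self, y_true, y_pred):
        return tf.math.reduce_mean(tf.square(y_true - y_pred), axis=-1)

loss_fn = CustomMeanSquaredError()
y_true = tf.constant([[0.0, 1.0], [1.0, 1.0]])
y_pred = tf.constant([[1.0, 1.0], [1.0, 0.0]])

per_sample = loss_fn.call(y_true, y_pred)  # shape (2,)
batch_loss = loss_fn(y_true, y_pred)       # scalar
```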


Answer


I opened an issue on GitHub. It's confirmed that a custom loss function is required to return one loss value per sample. The example will need to be updated to reflect this.
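A fixed version of the guide's example therefore reduces only over the last axis (the axis=-1 form matches the built-in losses), so that one loss value per sample is returned:

```python
import tensorflow as tf

# Fixed example: reduce only over the last (feature) axis so that
# one loss value per sample is returned, shape (batch_size,)
def custom_mean_squared_error(y_true, y_pred):
    return tf.math.reduce_mean(tf.square(y_true - y_pred), axis=-1)

# Keras then averages these per-sample losses (applying any sample
# weights) to produce the scalar loss reported during training
y_true = tf.constant([[0.0, 1.0], [1.0, 1.0]])
y_pred = tf.constant([[1.0, 1.0], [1.0, 0.0]])
losses = custom_mean_squared_error(y_true, y_pred)  # shape (2,)
```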

