Optimizing for accuracy instead of loss in Keras model


Problem Description


If I correctly understood the significance of the loss function to the model, it directs the model to be trained based on minimizing the loss value. So, for example, if I want my model to be trained to have the least mean absolute error, I should use MAE as the loss function. Why is it, then, that you sometimes see someone who wants the best accuracy possible, yet builds the model to minimize a completely different function? For example:

model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['acc'])


How come the model above is trained to give us the best acc, since during its training it will try to minimize a different function (MSE)? I know that, once trained, the model's metric will give us the best acc found during training.


My doubt is: shouldn't the focus of the model during its training be to maximize acc (or minimize 1/acc) instead of minimizing MSE? If done that way, wouldn't the model give us even higher accuracy, since it knows it has to maximize it during training?

Answer


To start with, the code snippet you have used as an example:

model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['acc'])


is actually invalid (although Keras will not produce any error or warning) for a very simple and elementary reason: MSE is a valid loss for regression problems, for which accuracy is meaningless (accuracy is meaningful only for classification problems, where in turn MSE is not a valid loss function). For details (including a code example), see own answer in What function defines accuracy in Keras when the loss is mean squared error (MSE)?; for a similar situation in scikit-learn, see own answer in this thread.
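A tiny plain-Python sketch (not Keras itself, with made-up numbers) of why "accuracy" degenerates on continuous regression targets: exact-match accuracy is essentially a float-equality check, so a model with near-perfect MSE still scores zero.

```python
# Continuous regression targets and very good predictions (tiny errors).
y_true = [1.2, 3.7, 2.05, 4.4]
y_pred = [t + e for t, e in zip(y_true, [0.01, -0.02, 0.015, -0.01])]

# Mean squared error: the natural regression criterion.
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# "Accuracy" is a fraction of exact matches between discrete labels;
# on continuous values it collapses into a float-equality check.
acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"MSE: {mse:.6f}")   # tiny -- an excellent model by regression standards
print(f"acc: {acc}")       # 0.0 -- tells us nothing useful here
```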


Continuing to your general question: in regression settings we usually don't need a separate performance metric, and we normally use the loss function itself for this purpose; i.e. the correct code for the example you have used would simply be

model.compile(loss='mean_squared_error', optimizer='sgd')


without any metrics specified. We could of course use metrics=['mse'], but this is redundant and not really needed. Sometimes people use something like

model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['mse','mae'])


i.e. optimise the model according to the MSE loss, but also report its performance in terms of the mean absolute error (MAE), in addition to MSE.
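A quick illustration (plain Python, hypothetical residuals) of why watching MAE alongside MSE can be informative: MSE penalizes errors quadratically, so a single outlier dominates it, while MAE grows only linearly.

```python
errors_clean = [0.5, -0.4, 0.3, -0.6]      # typical residuals
errors_outlier = [0.5, -0.4, 0.3, -10.0]   # same, but with one large outlier

def mse(errs):
    # Mean squared error: squares each residual, so outliers dominate.
    return sum(e ** 2 for e in errs) / len(errs)

def mae(errs):
    # Mean absolute error: linear in each residual, more robust to outliers.
    return sum(abs(e) for e in errs) / len(errs)

print(mse(errors_clean), mae(errors_clean))      # comparable magnitudes
print(mse(errors_outlier), mae(errors_outlier))  # MSE explodes, MAE moves modestly
```

Seeing the two metrics diverge during training is often a hint that a few large errors, rather than uniformly mediocre predictions, are driving the loss.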

Now, to your question:


shouldn't the focus of the model during its training be to maximize acc (or minimize 1/acc) instead of minimizing MSE?


is indeed valid, at least in principle (save for the reference to MSE), but only for classification problems, where, roughly speaking, the situation is as follows: we cannot use the vast arsenal of convex optimization methods in order to directly maximize the accuracy, because accuracy is not a differentiable function; so, we need a proxy differentiable function to use as loss. The most common example of such a loss function suitable for classification problems is the cross entropy.
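The non-differentiability of accuracy can be seen in a minimal sketch (plain Python, hypothetical one-feature data): nudging the weight of a logistic model leaves accuracy completely flat, so gradient descent gets no signal from it, while cross entropy changes smoothly.

```python
import math

# One-feature logistic model p(y=1|x) = sigmoid(w * x), with made-up data.
xs = [1.0, 2.0, -1.0, -2.0]
ys = [1, 1, 0, 0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def accuracy(w):
    # Fraction of examples whose thresholded prediction matches the label.
    return sum((sigmoid(w * x) > 0.5) == bool(y) for x, y in zip(xs, ys)) / len(xs)

def cross_entropy(w):
    # Average negative log-likelihood: a smooth, differentiable proxy.
    eps = 1e-12  # guard against log(0)
    return -sum(y * math.log(sigmoid(w * x) + eps) +
                (1 - y) * math.log(1.0 - sigmoid(w * x) + eps)
                for x, y in zip(xs, ys)) / len(xs)

# A small step in w leaves accuracy unchanged (its "gradient" is zero almost
# everywhere), while cross entropy keeps changing and can guide descent.
print(accuracy(1.0), accuracy(1.01))           # 1.0 1.0 -- flat
print(cross_entropy(1.0), cross_entropy(1.01)) # slightly different values
```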


Rather unsurprisingly, this question of yours pops up from time to time, albeit in slightly different contexts; see for example own answers in

  • Cost function training target versus accuracy desired goal
  • Targeting a specific metric to optimize in tensorflow


For the interplay between loss and accuracy in the special case of binary classification, you may find my answers in the following threads useful:

  • Loss & accuracy - Are these reasonable learning curves?
  • How does Keras evaluate the accuracy?

