Optimizing for accuracy instead of loss in Keras model

Question

If I have correctly understood the significance of the loss function to the model, it directs the model to be trained by minimizing the loss value. So, for example, if I want my model to be trained to have the least mean absolute error, I should use MAE as the loss function. Why is it, then, that you sometimes see someone wanting to achieve the best possible accuracy, yet building the model to minimize a completely different function? For example:

model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['acc'])

How come the model above is trained to give us the best acc, when during its training it tries to minimize a different function (MSE)? I know that, once trained, the metric of the model will give us the best acc found during training.

My doubt is: shouldn't the focus of the model during its training be to maximize acc (or minimize 1/acc) instead of minimizing MSE? If done that way, wouldn't the model give us an even higher accuracy, since it knows it has to maximize it during training?

Answer

To start with, the code snippet you have used as an example:

model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['acc'])

is actually invalid (although Keras will not produce any error or warning), for a very simple and elementary reason: MSE is a valid loss for regression problems, for which accuracy is meaningless (accuracy is meaningful only for classification problems, where MSE in turn is not a valid loss function). For details (including a code example), see my answer in What function defines accuracy in Keras when the loss is mean squared error (MSE)?; for a similar situation in scikit-learn, see my answer in this thread.
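
As an illustration, here is a minimal sketch of such an invalid combination (the data and layer sizes below are hypothetical, not taken from the question): Keras compiles and fits the model without any error or warning, but the reported 'acc' value carries no meaning for a continuous regression target.

import numpy as np
from tensorflow import keras
# Hypothetical regression data with a continuous target
X = np.random.rand(100, 4)
y = np.random.rand(100)
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation='relu'),
    keras.layers.Dense(1)  # linear output, i.e. a regression model
])
# Compiles and trains without complaint, yet 'acc' is meaningless here
model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['acc'])
model.fit(X, y, epochs=2, verbose=1)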

Coming now to your general question: in regression settings we usually do not need a separate performance metric, and we normally use the loss function itself for this purpose; i.e. the correct code for the example you used would simply be

model.compile(loss='mean_squared_error', optimizer='sgd')

without any metrics specified. We could of course use metrics=['mse'], but this is redundant and not really needed. Sometimes people use something like

model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['mse','mae'])

i.e. optimise the model according to the MSE loss, but report its performance in terms of the mean absolute error (MAE) in addition to MSE.
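
For concreteness, here is a minimal sketch of that setup (the data and layer sizes are hypothetical): the model is optimised on the MSE loss, while MAE is reported as an additional metric during training.

import numpy as np
from tensorflow import keras
# Hypothetical regression data
X = np.random.rand(200, 3)
y = X.sum(axis=1)
model = keras.Sequential([
    keras.Input(shape=(3,)),
    keras.layers.Dense(16, activation='relu'),
    keras.layers.Dense(1)
])
# Gradient descent minimises the MSE loss; MAE is only reported
model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['mse', 'mae'])
history = model.fit(X, y, epochs=5, verbose=0)
# history.history holds the loss together with the 'mse' and 'mae' metrics
# per epoch; loss and 'mse' coincide, which is why listing 'mse' is redundant
print(history.history.keys())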

Now, regarding your question:

shouldn't the focus of the model during its training be to maximize acc (or minimize 1/acc) instead of minimizing MSE?

is indeed valid, at least in principle (save for the reference to MSE), but only for classification problems, where, roughly speaking, the situation is as follows: we cannot use the vast arsenal of convex optimization methods to directly maximize accuracy, because accuracy is not a differentiable function; so we need a proxy differentiable function to use as the loss. The most common example of such a loss function suitable for classification problems is the cross entropy.

Rather unsurprisingly, this question pops up from time to time, albeit in slightly different contexts; see for example my answers in the related threads, including those on the interplay between loss and accuracy in the special case of binary classification.
