Relationship between loss and accuracy


Problem description

Is it practically possible to have decreasing loss and decreasing accuracy at each epoch when training a CNN model? I am getting the result below while training.

Can someone explain the possible reasons why this is happening?

Recommended answer

There are at least 5 reasons which might cause such behavior:

1. Outliers: imagine that you have 10 exactly identical images, 9 of which belong to class A and one to class B. In this case the model will start to assign a high probability of class A to this example because of the majority of examples. But then the signal from the outlier might destabilize the model and make accuracy decrease. In theory the model should stabilize at assigning a 90% score to class A, but that might take many epochs.

Solution: in order to deal with such examples, I advise you to use gradient clipping (you can add such an option to your optimizer). If you want to check whether this phenomenon occurs, inspect your loss distribution (the losses of individual examples from the training set) and look for outliers.
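As a minimal sketch (not from the original answer), assuming a tf.keras setup, gradient clipping can be enabled directly on the optimizer; the tiny CNN and the clip value 0.5 below are illustrative only:

```python
import tensorflow as tf

# Illustrative toy CNN; replace with your own model.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# clipvalue clips each gradient element to [-0.5, 0.5];
# clipnorm=1.0 would instead clip by the global gradient norm.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipvalue=0.5)

model.compile(optimizer=optimizer,
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```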

2. Bias: now imagine that you have 10 exactly identical images, but 5 of them are assigned class A and 5 class B. In this case the model will try to assign approximately a 50%-50% distribution over both of these classes. Your model can then achieve at most 50% accuracy here, by choosing one of the two valid classes.

Solution: try to increase the model capacity - very often you have a set of really similar images, and adding expressive power might help to discriminate similar examples. Beware of overfitting though. Another solution is to try a different training strategy. If you want to check whether such a phenomenon occurs, check the distribution of losses of individual examples. If the distribution is skewed toward higher values, you are probably suffering from bias.
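To make the "check the per-example losses" suggestion concrete, here is a rough sketch assuming a trained tf.keras model and arrays `x_train` / `y_train` with one-hot labels; these names are assumptions, not from the answer:

```python
import numpy as np
import tensorflow as tf

# Per-example cross-entropy: reduction="none" keeps one loss value per sample.
probs = model.predict(x_train, batch_size=256)
loss_fn = tf.keras.losses.CategoricalCrossentropy(reduction="none")
per_example_loss = loss_fn(y_train, probs).numpy()

print("mean loss:", per_example_loss.mean())
print("95th percentile:", np.percentile(per_example_loss, 95))
print("max loss (potential outlier):", per_example_loss.max())

# Indices of the highest-loss examples, worth inspecting by hand.
worst = np.argsort(per_example_loss)[-10:]
print("suspect examples:", worst)
```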

3. Class imbalance: now imagine that 90% of your images belong to class A. In an early stage of training, your model concentrates mainly on assigning this class to almost all examples. This can make individual losses reach really high values and destabilize your model by making the predicted distribution more unstable.

Solution: once again, gradient clipping. Second, patience: try simply leaving your model to train for more epochs; it should learn more subtle patterns in a later phase of training. And of course try class balancing, by assigning either sample_weights or class_weights. If you want to check whether this phenomenon occurs, check your class distribution.
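A minimal sketch of class balancing via `class_weight` in tf.keras; the inverse-frequency weighting below is one common choice, not something prescribed by the answer, and `x_train` / `y_train` (one-hot) are assumed:

```python
import numpy as np

# Inverse-frequency class weights: rare classes get larger weights.
labels = np.argmax(y_train, axis=1)
counts = np.bincount(labels)
class_weight = {c: len(labels) / (len(counts) * counts[c])
                for c in range(len(counts))}

model.fit(x_train, y_train,
          epochs=20,
          batch_size=64,
          class_weight=class_weight)   # per-class loss weighting
# Alternatively, pass sample_weight=... to fit() for per-example weights.
```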

4. Too strong regularization: if you set your regularization too strict, the training process concentrates mainly on making your weights have a smaller norm rather than on actually learning interesting insights.

Solution: add categorical_crossentropy as a metric and observe whether it is also decreasing. If not, it means your regularization is too strict, so try assigning a smaller weight penalty.
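As a sketch of both suggestions, assuming tf.keras: track categorical_crossentropy as a metric next to the regularized loss, and use a weaker L2 penalty (the 1e-5 factor is an arbitrary example):

```python
import tensorflow as tf

reg = tf.keras.regularizers.l2(1e-5)   # weaker penalty than e.g. 1e-2

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu",
                           kernel_regularizer=reg, input_shape=(32, 32, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax", kernel_regularizer=reg),
])

# The "loss" reported by Keras includes the regularization term; the
# categorical_crossentropy metric shows the data term alone, so you can
# see whether the model is still learning when the total loss stalls.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy", "categorical_crossentropy"])
```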

5. Bad model design: such behavior might also be caused by a wrong model design. There are several good practices one might apply in order to improve the model:

Batch Normalization: thanks to this technique you prevent radical changes of the inner network activations. This makes training much more stable and efficient. With a small batch size it can also be a genuine way of regularizing your model.
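A minimal sketch of a small CNN block with BatchNormalization in tf.keras; the layer sizes are illustrative only:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, use_bias=False, input_shape=(32, 32, 3)),
    tf.keras.layers.BatchNormalization(),   # normalize activations per batch
    tf.keras.layers.Activation("relu"),
    tf.keras.layers.Conv2D(64, 3, use_bias=False),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Activation("relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
```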

Gradient clipping: this makes your model training much more stable and efficient (see the optimizer sketch above).

Reduce the bottleneck effect: read this fantastic paper and check whether your model might suffer from a bottleneck problem.

Add auxiliary classifiers: if you are training your network from scratch, this should make your features much more meaningful and your training faster and more efficient (a sketch follows below).
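As mentioned above, here is a hedged sketch of an Inception-style auxiliary classifier built with the Keras functional API: a second softmax head attached to an intermediate layer, with a smaller weight on its loss. All layer sizes and the 0.3 loss weight are illustrative assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(32, 32, 3))
x = layers.Conv2D(32, 3, activation="relu")(inputs)
x = layers.MaxPooling2D()(x)
mid = layers.Conv2D(64, 3, activation="relu")(x)

# Auxiliary head on the intermediate feature map.
aux = layers.GlobalAveragePooling2D()(mid)
aux_out = layers.Dense(10, activation="softmax", name="aux")(aux)

# Main head on top of the full network.
x = layers.Conv2D(128, 3, activation="relu")(mid)
x = layers.GlobalAveragePooling2D()(x)
main_out = layers.Dense(10, activation="softmax", name="main")(x)

model = tf.keras.Model(inputs, [main_out, aux_out])
model.compile(optimizer="adam",
              loss={"main": "categorical_crossentropy",
                    "aux": "categorical_crossentropy"},
              loss_weights={"main": 1.0, "aux": 0.3},  # down-weight the aux loss
              metrics=["accuracy"])
```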
