Loss in Keras Model evaluation
Question
I am doing binary classification with Keras, with loss='binary_crossentropy', optimizer=tf.keras.optimizers.Adam, and a final layer of keras.layers.Dense(1, activation=tf.nn.sigmoid).
As far as I know, the loss value is used to evaluate the model during the training phase. However, when I run the Keras model evaluation on my testing dataset (e.g. m_recall.evaluate(testData, testLabel)), there are also loss values, accompanied by accuracy values, like the output below:
test size: (1889, 18525)
1889/1889 [==============================] - 1s 345us/step
m_acc: [0.5690245978371045, 0.9523557437797776]
1889/1889 [==============================] - 1s 352us/step
m_recall: [0.24519687695911097, 0.9359449444150344]
1889/1889 [==============================] - 1s 350us/step
m_f1: [0.502442331737344, 0.9216516675489677]
1889/1889 [==============================] - 1s 360us/step
metric name: ['loss', 'acc']
What is the meaning/usage of loss during testing? Why is it so high (e.g. 0.5690 for m_acc)? The accuracy evaluation seems fine to me (e.g. 0.9523 for m_acc), but I am concerned about the loss too. Does it mean my model performs badly?
P.S. m_acc, m_recall, etc. are just the way I name my models (they were trained on different metrics in GridSearchCV).
Update:
I just realized that loss values are not percentages, so how are they calculated? And with the current values, are they good enough, or do I need to optimize the model further?
Suggestions for further reading are appreciated too!
Answer
When defining a machine learning model, we want a way to measure its performance, so that we can compare it with other models, choose the best one, and make sure it is good enough. Therefore we define metrics like accuracy (in the context of classification), i.e. the proportion of samples the model classifies correctly, to measure how the model performs and whether it is adequate for our task.
Although these metrics are easy for us to interpret, the problem is that the learning process cannot use them directly to tune the model's parameters. Instead, we define other measures, usually called loss functions or objective functions, which the training process (i.e. optimization) can use directly. These functions are usually defined so that we expect low values to correspond to high accuracy. That's why you commonly see machine learning algorithms minimizing a loss function with the expectation that accuracy will increase; in other words, the models learn indirectly, by optimizing the loss function. Loss values also matter during training: if they are not decreasing, or keep fluctuating, it means there is a problem somewhere that needs to be fixed.
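To make the "minimize loss, accuracy follows" point concrete, here is a toy sketch (hypothetical 1-D data and a from-scratch logistic regression, not the asker's Keras model): gradient descent only ever sees the binary cross-entropy, yet accuracy rises as a side effect.

```python
import math
import random

# Hypothetical toy data: the label is 1 exactly when x > 0.
random.seed(0)
xs = [random.uniform(-2, 2) for _ in range(200)]
ys = [1.0 if x > 0 else 0.0 for x in xs]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# 1-D logistic regression trained by gradient descent on the
# mean binary cross-entropy (the optimizer never looks at accuracy).
w, b = 0.0, 0.0
lr = 0.5
for epoch in range(200):
    gw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / len(xs)
    gb = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * gw
    b -= lr * gb

preds = [sigmoid(w * x + b) for x in xs]
eps = 1e-12
clipped = [min(max(p, eps), 1 - eps) for p in preds]  # avoid log(0)
loss = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
            for y, p in zip(ys, clipped)) / len(ys)
acc = sum((p > 0.5) == (y > 0.5) for p, y in zip(ys, preds)) / len(ys)
print(f"loss={loss:.3f} accuracy={acc:.2%}")  # loss drops, accuracy rises
```

The loss is differentiable in w and b, so it can drive the parameter updates; accuracy (a step function of the parameters) could not.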
As a result, what we are ultimately concerned about (i.e. when testing a model) is the value of the metrics we initially defined, like accuracy; we don't care much about the final value of the loss function. That's why you don't hear things like "the loss value of [a specific model] on the ImageNet dataset is 8.732"! That number alone does not tell you whether the model is great, good, bad, or terrible. Instead, you hear that "this model achieves 87% accuracy on the ImageNet dataset".
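This also hints at why a loss around 0.57 can coexist with roughly 95% accuracy: cross-entropy penalizes low confidence even when the thresholded prediction is correct. A hypothetical illustration (toy labels and probabilities chosen for the example): two models with identical accuracy but very different losses.

```python
import math

def bce(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy, with predictions clipped away from 0/1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

labels    = [1, 1, 1, 1, 0]
confident = [0.99, 0.99, 0.99, 0.99, 0.40]  # all correct at threshold 0.5
hesitant  = [0.60, 0.60, 0.60, 0.60, 0.40]  # also all correct, but barely

print(bce(labels, confident))  # low loss
print(bce(labels, hesitant))   # several times higher, same 100% accuracy
```

So a high-ish loss next to a good accuracy often just means the model's predicted probabilities hover near the 0.5 threshold rather than that it misclassifies more samples.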