sklearn metrics.log_loss is positive vs. scoring 'neg_log_loss' is negative
Problem Description
Just to make sure I have this right:
If we use sklearn.metrics.log_loss standalone, i.e. log_loss(y_true, y_pred), it generates a positive score -- the smaller the score, the better the performance.
However, if we use 'neg_log_loss' as the scoring scheme, as in cross_val_score, the score is negative -- the bigger the score, the better the performance.
This is because the scoring scheme is built to be consistent with the other scoring schemes: since higher is generally better, the usual log_loss is negated to follow that convention, and it is done solely for that purpose. Is this understanding correct?
[Context: metrics.log_loss scores positive while 'neg_log_loss' scores negative, with both referring to the same documentation page.]
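The observation in the question can be reproduced directly. The following is a minimal sketch on a toy binary-classification dataset (the dataset and estimator are illustrative choices, not from the original post):

```python
# Reproduce the observation: the standalone metric is positive,
# while 'neg_log_loss' scores from cross_val_score are negative.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, random_state=0)
clf = LogisticRegression().fit(X, y)

# Standalone metric: a positive number, smaller is better.
standalone = log_loss(y, clf.predict_proba(X))

# Scorer used by cross_val_score: negative numbers, larger is better.
cv_scores = cross_val_score(LogisticRegression(), X, y,
                            scoring='neg_log_loss', cv=3)

print(standalone > 0)         # True
print((cv_scores < 0).all())  # True
```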
Recommended Answer
sklearn.metrics.log_loss is an implementation of the error metric as typically defined, and, like most error metrics, it is a positive number. It is a metric that is generally minimized (like mean squared error for regression), in contrast to metrics such as accuracy, which are maximized.
neg_log_loss is hence a technicality to create a utility value, which allows sklearn's optimizing functions and classes (including, for instance, cross_val_score, GridSearchCV, RandomizedSearchCV, and others) to maximize this utility without having to change their behavior for each individual metric.
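The sign-flip relationship can be verified explicitly: the 'neg_log_loss' scorer returns exactly the negated value of the log_loss metric. A minimal sketch (the dataset and estimator are illustrative assumptions):

```python
# Verify that the 'neg_log_loss' scorer is exactly -log_loss.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss, get_scorer

X, y = make_classification(n_samples=200, random_state=0)
clf = LogisticRegression().fit(X, y)

metric = log_loss(y, clf.predict_proba(X))      # positive, to be minimized
scorer = get_scorer('neg_log_loss')(clf, X, y)  # negative, to be maximized

print(np.isclose(scorer, -metric))  # True
```

This is why GridSearchCV and friends can always pick the candidate with the highest score, regardless of whether the underlying metric is an error (minimized) or a utility (maximized).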