How to understand the metrics of H2OModelMetrics Object through h2o.performance


Problem description



After creating a model with h2o.randomForest, I run:

perf <- h2o.performance(model, test)
print(perf)

I get the following output (the value of an H2OModelMetrics object):

H2OBinomialMetrics: drf

MSE:  0.1353948
RMSE:  0.3679604
LogLoss:  0.4639761
Mean Per-Class Error:  0.3733908
AUC:  0.6681437
Gini:  0.3362873

Confusion Matrix (vertical: actual; across: predicted) 
for F1-optimal threshold:
          0    1    Error        Rate
0      2109 1008 0.323388  =1008/3117
1       257  350 0.423394    =257/607
Totals 2366 1358 0.339689  =1265/3724

Maximum Metrics: Maximum metrics at their respective thresholds
                        metric threshold    value idx
1                       max f1  0.080124 0.356234 248
2                       max f2  0.038274 0.515566 330
3                 max f0point5  0.173215 0.330006 131
4                 max accuracy  0.288168 0.839957  64
5                max precision  0.941437 1.000000   0
6                   max recall  0.002550 1.000000 397
7              max specificity  0.941437 1.000000   0
8             max absolute_mcc  0.113838 0.201161 195
9   max min_per_class_accuracy  0.071985 0.621087 262
10 max mean_per_class_accuracy  0.078341 0.626921 251

Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` 
or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`

I usually look at sensitivity (recall) and specificity to compare the quality of my prediction models, but with the information provided I am not able to reason in terms of such metrics. Based on the above information, how can I evaluate the quality of my prediction?

If I compute such metrics using the confusion matrix I get sens=0.58, spec=0.68, which is different from the information provided.
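For reference, the by-hand arithmetic behind those two numbers can be checked in a few lines of plain Python (a language-neutral sketch; the counts are taken from the confusion matrix printed above, rows = actual, columns = predicted):

```python
# Counts from the printed confusion matrix (actual 0 row, actual 1 row)
tn, fp = 2109, 1008   # actual "0": predicted 0 / predicted 1
fn, tp = 257, 350     # actual "1": predicted 0 / predicted 1

sensitivity = tp / (tp + fn)   # recall on the "1" class
specificity = tn / (tn + fp)   # recall on the "0" class

print(round(sensitivity, 2))   # 0.58
print(round(specificity, 2))   # 0.68
```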

Is there any way to get such values, like we have when using confusionMatrix from the caret package?

For me, that kind of output is more intuitive than the logLoss metric.

Solution

The binomial classification models in h2o return a probability (p) of the prediction being a "1" (and they also, redundantly will tell you the probability of it being a "0", i.e. 1-p).

To use this model you have to decide on a cutoff. E.g. you could split it down the middle: if p > 0.5, predict "1", otherwise predict "0". But you could choose other values, and what you are seeing in this report is the model quality at different cutoffs: that is what the "threshold" column shows. The extreme values (remember, based on the test data you have given it) are these two:

5                max precision  0.941437 1.000000   0
6                   max recall  0.002550 1.000000 397

I.e. if you set the cutoff to 0.941437 the model has perfect precision, and if you set it to 0.002550 it has perfect recall.
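What choosing a cutoff means can be sketched in a few lines of plain Python (the probabilities here are made up for illustration; h2o computes all of this for you):

```python
# Hypothetical predicted probabilities P(class == "1") for four rows
probs = [0.02, 0.31, 0.55, 0.91]

def classify(probs, cutoff):
    """Turn probabilities into 0/1 labels at a given cutoff."""
    return [1 if p > cutoff else 0 for p in probs]

print(classify(probs, 0.5))    # split down the middle: [0, 0, 1, 1]
print(classify(probs, 0.288))  # the max-accuracy threshold from the table: [0, 1, 1, 1]
```

Lowering the cutoff turns more rows into "1"s, trading precision for recall; that trade-off is exactly what the threshold column in the report is sweeping over.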

The default confusion matrix it shows is the one at the F1-optimal threshold (as its header says), i.e. this line:

1                       max f1  0.080124 0.356234 248

(The answer to this question aims to explain that metric in more detail.)
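The f1, f2 and f0point5 rows in the table are all instances of the F-beta score, which combines precision (P) and recall (R); beta < 1 weights precision more heavily, beta > 1 weights recall. A small sketch, with made-up precision/recall values:

```python
def f_beta(p, r, beta):
    """F-beta score: weighted harmonic mean of precision p and recall r."""
    return (1 + beta**2) * p * r / (beta**2 * p + r)

# Hypothetical precision/recall at some threshold
p, r = 0.26, 0.58
print(round(f_beta(p, r, 1.0), 2))   # F1: balances the two
print(round(f_beta(p, r, 2.0), 2))   # F2: favours recall
print(round(f_beta(p, r, 0.5), 2))   # F0.5: favours precision
```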

Personally, I find max accuracy the most intuitive:

4                 max accuracy  0.288168 0.839957  64

I.e. maximum accuracy means the threshold which has the lowest error.
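Finding that threshold amounts to scanning candidate cutoffs and keeping the one with the fewest misclassifications. A minimal sketch with invented probabilities and labels:

```python
# Hypothetical model scores and true labels for six rows
probs  = [0.05, 0.20, 0.35, 0.60, 0.80, 0.95]
actual = [0,    0,    1,    0,    1,    1]

def accuracy(cutoff):
    """Fraction of rows classified correctly at this cutoff."""
    preds = [1 if p > cutoff else 0 for p in probs]
    return sum(p == a for p, a in zip(preds, actual)) / len(actual)

# The scores themselves are the only cutoffs worth trying
best = max(probs, key=accuracy)
print(best, round(accuracy(best), 2))  # 0.2 0.83
```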

Whichever of these metrics you decide is most suitable, you are still left with having to choose a threshold for your real-world unseen data. One approach is to use the threshold from the table, based on your test data (so if I think max accuracy is most important, I would use a threshold of 0.288 in my live application). But I've found that averaging the thresholds found from the test data and from the train data gives more solid results.

P.S. After resisting for a while, I've come around to being a fan of logloss. I've found models tuned for best logloss (rather than tuning for best recall, best precision, best accuracy, lowest MSE, etc, etc.) tended to be more robust when turned into real-world applications.
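One intuition for why logloss works well: it penalises confident wrong predictions very heavily, so a model tuned for it tends not to make overconfident mistakes. The standard binary logloss formula, in a small Python sketch with made-up numbers:

```python
import math

def logloss(probs, actual):
    """Mean binary cross-entropy; eps guards against log(0)."""
    eps = 1e-15
    return -sum(a * math.log(max(p, eps)) + (1 - a) * math.log(max(1 - p, eps))
                for p, a in zip(probs, actual)) / len(actual)

print(round(logloss([0.9, 0.1], [1, 0]), 3))  # confident and right: 0.105
print(round(logloss([0.1, 0.9], [1, 0]), 3))  # confident and wrong: 2.303
```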
