sklearn metrics for multiclass classification
Question
I have performed GaussianNB classification using sklearn. I tried to calculate the metrics using the following code:
from sklearn.metrics import accuracy_score, precision_score
print(accuracy_score(y_test, y_pred))
print(precision_score(y_test, y_pred))
The accuracy score works correctly, but the precision score calculation raises this error:
ValueError: Target is multiclass but average='binary'. Please choose another average setting.
Since the target is multiclass, how can I get metric scores such as precision and recall?
Recommended answer
The function call precision_score(y_test, y_pred) is equivalent to precision_score(y_test, y_pred, pos_label=1, average='binary'). The documentation (http://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html) tells us:
'binary':
Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.
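Whether scikit-learn treats a set of labels as binary or multiclass can be checked with type_of_target from sklearn.utils.multiclass. The sketch below is only illustrative; the two label lists are made up and are not the data from the question.

```python
from sklearn.utils.multiclass import type_of_target

# Hypothetical label vectors, not taken from the question.
binary_labels = [0, 1, 1, 0]        # two distinct classes
multiclass_labels = [0, 2, 1, 2]    # three distinct classes

binary_kind = type_of_target(binary_labels)
multiclass_kind = type_of_target(multiclass_labels)

print(binary_kind)       # binary
print(multiclass_kind)   # multiclass
```

Running the same check on y_test should show which case applies to your own data.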
So the problem is that your labels are not binary: they contain more than two classes. Fortunately, there are other average options which should work with your data:
precision_score(y_test, y_pred, average=None) will return the precision score for each class, while precision_score(y_test, y_pred, average='micro') will return the overall ratio tp / (tp + fp), with true and false positives counted across all classes.
The pos_label argument is ignored if you choose an average option other than 'binary'.
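As a minimal sketch of these average settings, the example below uses a small made-up three-class label set in place of the real y_test and y_pred. It also shows the 'macro' option and recall_score, which accept the same average parameter, and classification_report, which prints per-class precision, recall and F1 in one table.

```python
from sklearn.metrics import classification_report, precision_score, recall_score

# Hypothetical three-class labels standing in for y_test and y_pred.
y_test = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

# One precision value per class, in sorted label order.
per_class = precision_score(y_test, y_pred, average=None)

# Micro: pool true and false positives over all classes, then divide.
micro = precision_score(y_test, y_pred, average='micro')

# Macro: unweighted mean of the per-class precision values.
macro = precision_score(y_test, y_pred, average='macro')

# recall_score takes the same average argument.
recall_macro = recall_score(y_test, y_pred, average='macro')

print(per_class)      # roughly [0.667, 0.0, 0.0]
print(micro)          # roughly 0.333
print(macro)          # roughly 0.222
print(recall_macro)   # roughly 0.333

# Per-class precision, recall and F1 in a single text table.
print(classification_report(y_test, y_pred))
```

Which setting to use depends on the goal: average=None exposes weak classes individually, while 'micro' and 'macro' collapse the result to a single number.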