Scikits confusion matrix with cross validation


Question

I am training an SVM classifier with cross validation (StratifiedKFold) using the scikits interfaces. For each of the k test sets I get a classification result, and I want a single confusion matrix that covers all of the results. Scikits has a confusion matrix interface: sklearn.metrics.confusion_matrix(y_true, y_pred). My question is how I should accumulate the y_true and y_pred values, which are NumPy arrays. Should I define the size of the arrays based on my k-fold parameter, and then add the y_true and y_pred of each result to those arrays?

Recommended answer

You can either use an aggregate confusion matrix, or compute one matrix per CV partition and then compute the mean and the standard deviation (or standard error) of each component of the matrix as a measure of variability.
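Both options can be sketched in a few lines. This is a minimal illustration, not the answerer's own code. It assumes a recent scikit-learn in which StratifiedKFold lives in sklearn.model_selection, and it uses the iris dataset, a linear kernel, and k = 5 purely as placeholders. No pre-sized array is needed: one matrix per fold is appended to a list, and passing labels= keeps every matrix the same shape even if a fold happens to miss a class.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

# Placeholder data; substitute your own X (features) and y (labels).
X, y = load_iris(return_X_y=True)
labels = np.unique(y)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

fold_matrices = []
for train_idx, test_idx in skf.split(X, y):
    clf = SVC(kernel="linear")
    clf.fit(X[train_idx], y[train_idx])
    y_pred = clf.predict(X[test_idx])
    # Fix the label order so all per-fold matrices have identical shape.
    fold_matrices.append(confusion_matrix(y[test_idx], y_pred, labels=labels))

fold_matrices = np.array(fold_matrices)  # shape: (k, n_classes, n_classes)

# Option 1: aggregate matrix over all folds (element-wise sum).
aggregate_cm = fold_matrices.sum(axis=0)

# Option 2: per-cell mean, sample standard deviation and standard error.
mean_cm = fold_matrices.mean(axis=0)
std_cm = fold_matrices.std(axis=0, ddof=1)
sem_cm = std_cm / np.sqrt(len(fold_matrices))

print(aggregate_cm)
print(mean_cm)
print(std_cm)
```

Summing the per-fold matrices gives the same result as concatenating y_true and y_pred across all folds and calling confusion_matrix once, because every sample appears in exactly one test set.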

For the classification report, the code would need to be modified to accept 2-dimensional inputs, so that the predictions for each CV partition can be passed in and the mean score and standard deviation computed for each class.
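One way to get per-class means and standard deviations without modifying the report code itself is to collect the underlying metrics fold by fold. The sketch below takes that different route: it calls precision_recall_fscore_support on each partition instead of changing classification_report. As before, the dataset, kernel, and k = 5 are placeholder assumptions, and the zero_division argument assumes scikit-learn 0.22 or later.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

# Placeholder data; substitute your own X (features) and y (labels).
X, y = load_iris(return_X_y=True)
labels = np.unique(y)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

per_fold = []
for train_idx, test_idx in skf.split(X, y):
    clf = SVC(kernel="linear").fit(X[train_idx], y[train_idx])
    y_pred = clf.predict(X[test_idx])
    # With average=None (the default) each metric is an array with one
    # entry per class, in the order given by `labels`.
    precision, recall, f1, _support = precision_recall_fscore_support(
        y[test_idx], y_pred, labels=labels, zero_division=0
    )
    per_fold.append(np.stack([precision, recall, f1]))

per_fold = np.array(per_fold)  # shape: (k, 3 metrics, n_classes)

mean_scores = per_fold.mean(axis=0)          # shape: (3, n_classes)
std_scores = per_fold.std(axis=0, ddof=1)    # sample std across folds

for j, label in enumerate(labels):
    print(
        f"class {label}: "
        f"precision {mean_scores[0, j]:.3f} +/- {std_scores[0, j]:.3f}, "
        f"recall {mean_scores[1, j]:.3f} +/- {std_scores[1, j]:.3f}, "
        f"f1 {mean_scores[2, j]:.3f} +/- {std_scores[2, j]:.3f}"
    )
```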
