ROC curve for classification from randomForest

Question

I am using the randomForest package in R for a classification task.

rf_object <- randomForest(data_matrix, label_factor, cutoff = c(k, 1 - k))

where k ranges from 0.1 to 0.9.

pred <- predict(rf_object,test_data_matrix)

I compared the output of the random forest classifier with the true labels, so I have performance measures such as accuracy, MCC, sensitivity, and specificity at 9 cutoff points.

Now I want to plot the ROC curve and compute the area under it to see how good the performance is. Most R packages (such as ROCR and pROC) require predictions and labels, but I only have sensitivity (TPR) and specificity (1 - FPR).

Can anyone tell me whether the cutoff method is a correct or reliable way to produce a ROC curve? Is there a way to obtain the ROC curve and the area under it from TPR and FPR alone?

I also tried training the random forest with the following commands. This way the predictions were continuous and were accepted by the ROCR and pROC packages. However, I am not sure whether this is the correct approach. Can anyone advise me on this method?

rf_object <- randomForest(data_matrix, label_vector)
pred <- predict(rf_object, test_data_matrix)

Thank you for taking the time to read my question! I have spent a long time searching for this. Any suggestions or advice would be appreciated.

Recommended answer

Why don't you output class probabilities? That gives you a ranking of your predictions, which you can feed directly into any ROC package.

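# Fit a classification forest (labels must be a factor)
m <- randomForest(data_matrix, labels)
# Returns a matrix of class probabilities, one column per class
predict(m, newdata_matrix, type = 'prob')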

Note that, to use randomForest as a classification tool, labels must be a factor vector.
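
To make the link between class probabilities and the ROC curve concrete, here is a minimal base-R sketch of roughly what a ROC package does with such scores. It is an illustration, not the answer author's code: `roc_points` and `trapezoid_auc` are hypothetical helper names, and `prob_pos` and `labels` are toy stand-ins for the positive-class column of `predict(m, newdata_matrix, type = 'prob')` and the true test labels.

```r
# Toy stand-ins (assumptions, not real results): in practice prob_pos would be
# the positive-class column of predict(m, newdata_matrix, type = 'prob')
# and labels the true classes of the test set.
labels   <- factor(c("pos", "pos", "neg", "pos", "neg", "neg", "pos", "neg"),
                   levels = c("neg", "pos"))
prob_pos <- c(0.95, 0.80, 0.70, 0.65, 0.40, 0.35, 0.30, 0.10)

# Hypothetical helper: use every distinct score as a cutoff and record
# the false positive rate and true positive rate at each one.
roc_points <- function(score, truth, positive) {
  is_pos  <- truth == positive
  cutoffs <- c(Inf, sort(unique(score), decreasing = TRUE))
  tpr <- sapply(cutoffs, function(th) sum(score >= th & is_pos) / sum(is_pos))
  fpr <- sapply(cutoffs, function(th) sum(score >= th & !is_pos) / sum(!is_pos))
  data.frame(cutoff = cutoffs, fpr = fpr, tpr = tpr)
}

# Hypothetical helper: trapezoidal area under the piecewise-linear curve
# through the given (fpr, tpr) points.
trapezoid_auc <- function(fpr, tpr) {
  ord <- order(fpr, tpr)
  fpr <- fpr[ord]
  tpr <- tpr[ord]
  sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
}

roc <- roc_points(prob_pos, labels, positive = "pos")
auc <- trapezoid_auc(roc$fpr, roc$tpr)
print(roc)
print(auc)
```

The curve can then be drawn with `plot(roc$fpr, roc$tpr, type = "l")`. The same `trapezoid_auc` helper could also be applied to the nine (FPR, TPR) pairs from the cutoff approach in the question, after adding the end points (0, 0) and (1, 1), but nine points give only a coarse approximation of the curve. If pROC is installed, `pROC::roc(labels, prob_pos)` followed by `pROC::auc()` should give a matching value.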
