How to plot ROC curves for every cross-validation fold using Caret
Question
I have the following code:
library(mlbench)
library(caret)
library(ggplot2)

set.seed(998)

# Prepare data ------------------------------------------------------------
data(Sonar)
my_data <- Sonar

# Cross Validation Definition ---------------------------------------------
fitControl <-
  trainControl(
    method = "cv",
    number = 10,
    classProbs = TRUE,
    savePredictions = TRUE,
    summaryFunction = twoClassSummary
  )

# Training with Random Forest ---------------------------------------------
model <- train(
  Class ~ .,
  data = my_data,
  method = "rf",
  trControl = fitControl,
  metric = "ROC"
)

for_lift <- data.frame(Class = model$pred$obs, rf = model$pred$R)
lift_obj <- lift(Class ~ rf, data = for_lift, class = "R")

# Plot ROC ----------------------------------------------------------------
ggplot(lift_obj$data) +
  geom_line(aes(1 - Sp, Sn, color = liftModelVar)) +
  scale_color_discrete(guide = guide_legend(title = "method"))
It produces this plot.
Note that I am performing 10-fold cross-validation, yet the plot shows only a single ROC curve for the overall result.
What I want is 10 ROC curves, one for each cross-validation fold. How can I achieve that?
Recommended answer
library(mlbench)
library(caret)
library(ggplot2)

set.seed(998)

# Prepare data ------------------------------------------------------------
data(Sonar)
my_data <- Sonar

# Cross Validation Definition ---------------------------------------------
fitControl <-
  trainControl(
    method = "cv",
    number = 10,
    classProbs = TRUE,
    savePredictions = TRUE,
    summaryFunction = twoClassSummary
  )

# Training with Random Forest ---------------------------------------------
model <- train(
  Class ~ .,
  data = my_data,
  method = "rf",
  trControl = fitControl,
  metric = "ROC"
)

# Keep the fold label (Resample) next to the observed class and P(class = "R")
for_lift <- data.frame(
  Class = model$pred$obs,
  rf = model$pred$R,
  resample = model$pred$Resample
)

# Build one lift/ROC data set per fold and stack them
lift_df <- data.frame()
for (fold in unique(for_lift$resample)) {
  fold_df <- dplyr::filter(for_lift, resample == fold)
  lift_obj_data <- lift(Class ~ rf, data = fold_df, class = "R")$data
  lift_obj_data$fold <- fold
  lift_df <- rbind(lift_df, lift_obj_data)
}

# Pooled curve over all folds (not used in the plot below)
lift_obj <- lift(Class ~ rf, data = for_lift, class = "R")

# Plot ROC ----------------------------------------------------------------
# One line per fold: x = 1 - specificity, y = sensitivity
ggplot(lift_df) +
  geom_line(aes(1 - Sp, Sn, color = fold)) +
  scale_color_discrete(guide = guide_legend(title = "Fold"))
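One thing to watch for: `savePredictions = T` is equivalent to `"all"`, so `model$pred` holds held-out predictions for every candidate `mtry` that `train` evaluated, and each fold's curve then mixes several models. To plot only the selected model, either set `savePredictions = "final"` or filter `model$pred` against `model$bestTune`. Here is a minimal sketch of the filtering step. The `pred` and `best_tune` data frames are made-up stand-ins for `model$pred` and `model$bestTune`, and the `mtry` values are illustrative only.

```r
# Hypothetical stand-in for model$pred: 2 folds x 2 candidate mtry values.
pred <- data.frame(
  obs      = c("M", "R", "M", "R", "M", "R", "M", "R"),
  R        = c(0.2, 0.9, 0.4, 0.7, 0.3, 0.8, 0.6, 0.5),
  mtry     = c(2, 2, 2, 2, 31, 31, 31, 31),
  Resample = c("Fold01", "Fold01", "Fold02", "Fold02",
               "Fold01", "Fold01", "Fold02", "Fold02")
)

# Hypothetical stand-in for model$bestTune (a one-row data frame).
best_tune <- data.frame(mtry = 2)

# merge() joins on the shared column (mtry), so only rows belonging
# to the selected tuning value survive.
best_pred <- merge(pred, best_tune)
print(best_pred)
```

With the real model, the same idea would be `merge(model$pred, model$bestTune)`, and the result would take the place of `model$pred` when building `for_lift`.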
To compute a metric for each fold (the code below reports per-fold accuracy, not AUC):
model <- train(
  Class ~ .,
  data = my_data,
  method = "rf",
  trControl = fitControl,
  metric = "ROC"
)

library(plyr)
library(MLmetrics)

# Split the saved predictions by fold and compute accuracy within each one
ddply(model$pred, "Resample", summarise,
      accuracy = Accuracy(pred, obs))
Output:
Resample accuracy
1 Fold01 0.8253968
2 Fold02 0.8095238
3 Fold03 0.8000000
4 Fold04 0.8253968
5 Fold05 0.8095238
6 Fold06 0.8253968
7 Fold07 0.8333333
8 Fold08 0.8253968
9 Fold09 0.9841270
10 Fold10 0.7936508
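The snippet above reports accuracy for each fold. If the area under each fold's ROC curve is what is needed, one option is to compute it from the class probabilities with the rank-sum (Mann-Whitney) identity. This is a base-R sketch, not the original answer's method. The helper `auc_rank` and the toy `pred` data frame are hypothetical; `pred` only imitates the `obs`, `R` and `Resample` columns of `model$pred`.

```r
# AUC via the rank-sum identity:
# AUC = (sum of ranks of positives - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
# `auc_rank` is a hypothetical helper, not part of caret or MLmetrics.
auc_rank <- function(score, is_pos) {
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(score)  # ties receive average ranks
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy stand-in for model$pred: observed class, P(class = "R"), fold label.
pred <- data.frame(
  obs      = c("R", "R", "M", "M", "R", "M", "R", "M"),
  R        = c(0.9, 0.8, 0.3, 0.1, 0.6, 0.7, 0.4, 0.2),
  Resample = rep(c("Fold01", "Fold02"), each = 4)
)

# One AUC per fold, treating "R" as the positive class.
fold_auc <- sapply(
  split(pred, pred$Resample),
  function(d) auc_rank(d$R, d$obs == "R")
)
print(fold_auc)
```

On this toy data the result is 1 for Fold01 and 0.5 for Fold02. With the real model, passing `model$pred` in place of `pred` would give one AUC per fold, ideally after restricting it to the selected `mtry`.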