Custom metric (hmeasure) for summaryFunction caret classification

Question

I am trying to use the H-measure (Hand, 2009) as a custom metric for training SVMs in caret. As I am relatively new to R, I tried to adapt the twoClassSummary function. All I need is to pass the true class labels and the predicted class probabilities from the model (an SVM) to the HMeasure function from the hmeasure package, instead of using ROC or other measures of classification performance in caret.

For example, a call to the HMeasure function in R, HMeasure(true.class, predictedProbs[,2]), calculates the H-measure. Using the adaptation of the twoClassSummary code below results in an error: 'x' must be numeric.

Maybe the train function can't "see" the predicted probabilities needed to evaluate the HMeasure function. How can I fix this?

I've read the documentation and the linked questions on SO dealing with regression. That's got me some of the way. I would be grateful for any help or pointers towards a solution.

library(caret)
library(doMC)
library(hmeasure)
library(mlbench)

set.seed(825)

data(Sonar)
table(Sonar$Class) 
inTraining <- createDataPartition(Sonar$Class, p = 0.75, list = FALSE)
training <- Sonar[inTraining, ]
testing <- Sonar[-inTraining, ]


# using caret
fitControl <- trainControl(method = "repeatedcv",number = 2,repeats=2,summaryFunction=twoClassSummary,classProbs=TRUE)

svmFit1 <- train(Class ~ ., data = training,method = "svmRadial",trControl =    fitControl,preProc = c("center", "scale"),tuneLength = 8,metric = "ROC")

predictedProbs <- predict(svmFit1, newdata = testing , type = "prob")
true.class<-testing$Class
hmeas<- HMeasure(true.class,predictedProbs[,2]) # suppose its Rocks we're interested in predicting
hmeasure.probs<-hmeas$metrics[c('H')] # returns the H measure metric 

hmeasureCaret<-function (data, lev = NULL, model = NULL,...) 
{ 
# adaptation of twoClassSummary
require(hmeasure)
if (!all(levels(data[, "pred"]) == levels(data[, "obs"]))) 
 stop("levels of observed and predicted data do not match")
#lev is a character string that has the outcome factor levels taken from the training   data
hObject <- try(hmeasure::HMeasure(data$obs, data[, lev[1]]),silent=TRUE)
hmeasH <- if (class(hObject)[1] == "try-error") {
NA
} else {hObject$metrics[[1]]  #hObject$metrics[c('H')] returns a dataframe, need to    return a vector 
}
out<-hmeasH 
names(out) <- c("Hmeas")
#class(out)
}
environment(hmeasureCaret) <- asNamespace('caret')

Non-working code below.

ctrl <- trainControl(method = "cv", summaryFunction = hmeasureCaret,classProbs=TRUE,allowParallel = TRUE,
                  verboseIter=TRUE,returnData=FALSE,savePredictions=FALSE)
set.seed(1)

svmTune <- train(Class.f ~ ., data = training,method = "svmRadial",trControl =    ctrl,preProc = c("center", "scale"),tuneLength = 8,metric="Hmeas",
              verbose = FALSE)


Recommended answer

This code works. I'm posting a solution in case anyone else wants to use or improve upon it. The problems were caused by incorrect referencing of the HMeasure object and a typo in the function's return value: the final line was commented out, so the function did not return the named metric.
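To illustrate why the final line matters, here is a minimal base-R sketch. The function names `broken_summary` and `fixed_summary` and the value 0.42 are hypothetical placeholders, not part of the original post. In R a function returns the value of its last evaluated expression, and the value of `names(out) <- "Hmeas"` is the string on the right-hand side, not the numeric vector. That is a plausible source of the 'x' must be numeric error, although the sketch does not reproduce caret's internals.

```r
# Hypothetical stand-ins for a summary function; 0.42 is an arbitrary placeholder.
broken_summary <- function() {
  out <- 0.42
  names(out) <- "Hmeas"   # last evaluated expression: its value is the string "Hmeas"
}

fixed_summary <- function() {
  out <- 0.42
  names(out) <- "Hmeas"
  out                     # explicitly return the named numeric vector
}

broken <- broken_summary()
fixed  <- fixed_summary()

is.numeric(broken)  # FALSE: a character string came back
is.numeric(fixed)   # TRUE: a named numeric vector came back
```

Ending the function with `out` (or `return(out)`) is what the corrected version of the summary function does.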

library(caret)
library(doMC)
library(hmeasure)
library(mlbench)

set.seed(825)
registerDoMC(cores = 4)

data(Sonar)
table(Sonar$Class) 

inTraining <- createDataPartition(Sonar$Class, p = 0.5, list = FALSE)
training <- Sonar[inTraining, ]
testing <- Sonar[-inTraining, ]

hmeasureCaret<-function (data, lev = NULL, model = NULL,...) 
{ 
  # adaptation of twoClassSummary
  require(hmeasure)
  if (!all(levels(data[, "pred"]) == levels(data[, "obs"]))) 
    stop("levels of observed and predicted data do not match")
  hObject <- try(hmeasure::HMeasure(data$obs, data[, lev[1]]),silent=TRUE)
  hmeasH <- if (inherits(hObject, "try-error")) {
    NA
  } else {hObject$metrics[[1]]  #hObject$metrics[c('H')] returns a dataframe, need to return a vector 
  }
  out<-hmeasH 
  names(out) <- c("Hmeas")
  out 
}
#environment(hmeasureCaret) <- asNamespace('caret')


ctrl <- trainControl(method = "repeatedcv",number = 10, repeats = 5, summaryFunction = hmeasureCaret,classProbs=TRUE,allowParallel = TRUE,
                     verboseIter=FALSE,returnData=FALSE,savePredictions=FALSE)
set.seed(123)

svmTune <- train(Class ~ ., data = training,method = "svmRadial",trControl = ctrl,preProc = c("center", "scale"),tuneLength = 15,metric="Hmeas",
                 verbose = FALSE)
svmTune

predictedProbs <- predict(svmTune, newdata = testing , type = "prob")

true.class<-testing$Class

hmeas.check<- HMeasure(true.class,predictedProbs[,2])

summary(hmeas.check)
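One way to check a custom summary function before handing it to train is to call it directly on a hand-built data frame. The sketch below assumes the layout caret passes when classProbs = TRUE (columns obs and pred, plus one probability column per class level). The function name `hmeasureCheck` and the simulated probabilities are hypothetical, made up purely for illustration.

```r
library(hmeasure)

# Same shape as the summary function above, with the H measure picked by name.
hmeasureCheck <- function(data, lev = NULL, model = NULL, ...) {
  hObject <- try(HMeasure(data$obs, data[, lev[1]]), silent = TRUE)
  out <- if (inherits(hObject, "try-error")) NA_real_ else hObject$metrics[["H"]]
  names(out) <- "Hmeas"
  out
}

# Mock resample data: obs, pred, plus one probability column per class level.
set.seed(1)
lev   <- c("M", "R")
obs   <- factor(rep(lev, each = 20), levels = lev)
probM <- ifelse(obs == "M", runif(40, 0.4, 1), runif(40, 0, 0.6))
mock  <- data.frame(
  obs  = obs,
  pred = factor(ifelse(probM > 0.5, "M", "R"), levels = lev),
  M    = probM,
  R    = 1 - probM
)

res <- hmeasureCheck(mock, lev = lev)
print(res)  # a single named numeric value, which is the shape train() needs
```

If the call returns anything other than a named numeric vector, the function is unlikely to work as a summaryFunction, and the problem can be debugged without running a full resampling loop.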
