Generate a confusion matrix for svm in e1071 for CV results


Question

I ran a classification with svm from e1071. The goal is to predict type from all the other variables in dtm.

 dtm[140:145] %>% str()
 'data.frame':  385 obs. of  6 variables:
 $ think   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ actually: num  0 0 0 0 0 0 0 0 0 0 ...
 $ comes   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ able    : num  0 0 0 0 0 0 0 0 0 0 ...
 $ hours   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ type    : Factor w/ 4 levels "-1","0","1","9": 4 3 3 3 4 1 4 4 4 3 ...

To train and test the model, I used 10-fold cross-validation.

model <- svm(type~., dtm, cross = 10, gamma = 0.5, cost = 1)
summary(model)

Call:
svm(formula = type ~ ., data = dtm, cross = 10, gamma = 0.5, cost = 1)


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  radial 
       cost:  1 
     gamma:  0.5 

Number of Support Vectors:  385

 ( 193 134 41 17 )


Number of Classes:  4 

Levels: 
 -1 0 1 9

10-fold cross-validation on training data:

Total Accuracy: 50.12987 
Single Accuracies:
 52.63158 51.28205 52.63158 43.58974 60.52632 43.58974 57.89474 48.71795 
 39.47368 51.28205 

My question is: how can I generate a confusion matrix for these results? Which columns of model do I have to pass to table() or confusionMatrix() to get the matrix?

Recommended answer

As far as I know, there is no way to access the per-fold predictions in e1071 when doing cross-validation.
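To see what the fitted object does expose, here is a minimal sketch. It uses the built-in iris data as a stand-in for dtm (an assumption; any data frame with a factor response should behave the same way). With cross set, the object carries the per-fold accuracies and the overall accuracy, but no per-observation predictions from the folds.

```r
library(e1071)

# iris is a stand-in for the asker's dtm (assumption)
model <- svm(Species ~ ., data = iris, cross = 5)

model$accuracies    # one accuracy (in percent) per fold
model$tot.accuracy  # overall cross-validated accuracy (in percent)

# fitted(model) holds predictions from the model trained on ALL rows,
# so a table built from it reflects training fit, not CV performance
table(Predicted = fitted(model), Actual = iris$Species)
```

A table built from fitted(model) will therefore tend to look better than the cross-validated accuracy suggests, which is why the folds are run by hand below.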

A simple way to do it:

Some data:

library(e1071)
library(mlbench)
data(Sonar)

Generate the folds:

k <- 10
folds <- sample(rep(1:k, length.out = nrow(Sonar)), nrow(Sonar))

Run the models:

z <- lapply(1:k, function(x){
  # train on every fold except x
  model <- svm(Class~., Sonar[folds != x, ], gamma = 0.5, cost = 1, probability = TRUE)
  # predict the held-out fold x and keep the true labels alongside
  pred <- predict(model, Sonar[folds == x, ])
  true <- Sonar$Class[folds == x]
  return(data.frame(pred = pred, true = true))
})

To generate a confusion matrix for all left-out samples:

z1 <- do.call(rbind, z)
caret::confusionMatrix(z1$pred, z1$true)
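Since the question also mentions table(), here is a base-R sketch of the same pooled matrix. The pred and true vectors are made-up toy values standing in for z1$pred and z1$true; overall accuracy is the sum of the diagonal divided by the total count.

```r
# Toy stand-ins for z1$pred and z1$true (hypothetical values)
lv <- c("M", "R")
pred <- factor(c("M", "M", "R", "R", "M", "R"), levels = lv)
true <- factor(c("M", "R", "R", "R", "M", "M"), levels = lv)

# Rows are predictions, columns are the true classes
cm <- table(Predicted = pred, Actual = true)
cm

# Overall accuracy: correctly classified / all samples
accuracy <- sum(diag(cm)) / sum(cm)
accuracy
```

Giving both factors the same levels keeps the table square even if one class is never predicted in a fold.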

To generate one for each fold:

lapply(z, function(x){
  caret::confusionMatrix(x$pred, x$true)
})

For reproducibility, set the seed before creating the folds.
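A small sketch of what that looks like, assuming an arbitrary seed of 123 and a hypothetical helper make_folds(). The row count is hard-coded to 208 (the size of Sonar) so the snippet runs without extra packages.

```r
k <- 10
n <- 208  # number of rows in Sonar, hard-coded to avoid loading mlbench

# make_folds() is a hypothetical helper, not part of e1071 or caret
make_folds <- function(seed) {
  set.seed(seed)                       # fix the RNG state first
  sample(rep(1:k, length.out = n), n)  # shuffled fold labels
}

folds_a <- make_folds(123)
folds_b <- make_folds(123)

identical(folds_a, folds_b)  # same seed gives the same assignment
table(folds_a)               # fold sizes differ by at most one
```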

In general, if you do this sort of thing often, choose a higher-level library such as mlr or caret.
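As a rough illustration of the caret route, the sketch below keeps the held-out predictions during resampling and builds the matrix from them. Note the assumptions: method = "svmRadial" fits the SVM through kernlab rather than e1071, caret tunes the cost over its default grid, and the kernel width is estimated from the data, so the hyperparameters are not the gamma = 0.5, cost = 1 used above.

```r
library(caret)    # train(), trainControl(), confusionMatrix()
library(mlbench)  # Sonar data
data(Sonar)

set.seed(123)  # arbitrary seed so the fold assignment is repeatable
ctrl <- trainControl(method = "cv", number = 10,
                     savePredictions = "final")  # keep held-out predictions

# "svmRadial" is backed by kernlab, not e1071's svm()
fit <- train(Class ~ ., data = Sonar, method = "svmRadial",
             trControl = ctrl)

# fit$pred has one row per held-out sample, with columns pred and obs
confusionMatrix(fit$pred$pred, fit$pred$obs)
```

With savePredictions = "final", only the predictions for the selected tuning setting are stored, so every row of Sonar appears once in fit$pred.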
