Support vector machine train caret error: kernlab class probability calculations failed; returning NAs


Problem description

I have some data where the Y variable is a factor: Good or Bad. I am building a support vector machine using the 'train' function from the 'caret' package. With 'train' I was able to finalize the values of the tuning parameters and obtain the final support vector machine. For the test data I can predict the class, but when I try to predict probabilities for the test data I get the warning below.

For example, my model tells me that the 1st data point in the test data has y = 'Good', but I want to know the probability of getting 'Good'. Generally, a support vector machine can compute a probability for each prediction: if the Y variable has 2 outcomes, the model predicts the probability of each outcome, and the outcome with the highest probability is taken as the final prediction.

**Warning message:  
In probFunction(method, modelFit, ppUnk) :  
  kernlab class probability calculations failed; returning NAs**

Sample code is below:

library(caret)
trainset <- data.frame(
    class = factor(c("Good", "Bad", "Good", "Good", "Bad", "Good",
                     "Good", "Good", "Good", "Bad", "Bad", "Bad")),
    age = c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28, 25, 24))

testset <- data.frame(
    class = factor(c("Good", "Bad", "Good")),
    age = c(64, 23, 50))



library(kernlab)
set.seed(231)

### estimate a reasonable range for the sigma tuning parameter from the training data
sigDist <- sigest(class ~ ., data = trainset, frac = 1)
### create the tuning grid: .sigma is fixed to the first estimate from the line above, and .C is varied to find its best value
svmTuneGrid <- data.frame(.sigma = sigDist[1], .C = 2^(-2:7))

set.seed(1056)
svmFit <- train(class ~ .,
                data = trainset,
                method = "svmRadial",
                preProc = c("center", "scale"),
                tuneGrid = svmTuneGrid,
                trControl = trainControl(method = "repeatedcv", repeats = 5))

### svmFit finds the optimal values of tuning parameters and builds the model using the best parameters

### to predict class of test data
predictedClasses <- predict(svmFit, testset )
str(predictedClasses)


### predict probabilities; this call produces the warning shown above
predictedProbs <- predict(svmFit, newdata = testset , type = "prob")
head(predictedProbs)

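New question below this line: according to the output below, there are 9 support vectors. How can I identify which 9 of the 12 training data points they are?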

svmFit$finalModel

Support Vector Machine object of class "ksvm"

SV type: C-svc  (classification)
 parameter : cost C = 1

Gaussian Radial Basis kernel function.
 Hyperparameter : sigma =  0.72640759446315

Number of Support Vectors : 9

Objective Function Value : -5.6994
Training error : 0.083333

Recommended answer

In the trainControl call, you have to set classProbs = TRUE if you want class probabilities to be returned.

svmFit <- train(class ~ .,
    data = trainset,
    method = "svmRadial",
    preProc = c("center", "scale"),
    tuneGrid = svmTuneGrid,
    trControl = trainControl(method = "repeatedcv", repeats = 5,
                             classProbs = TRUE))

predictedClasses <- predict(svmFit, testset )
predictedProbs <- predict(svmFit, newdata = testset , type = "prob")

This gives the probabilities of being in the Bad or Good class for the test dataset:

print(predictedProbs)
        Bad      Good
1 0.2302979 0.7697021
2 0.7135050 0.2864950
3 0.2230889 0.7769111
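To see why the flag matters, it can help to look at kernlab directly. The sketch below is not part of the original answer. It assumes that caret forwards classProbs to the prob.model argument of ksvm, which is my understanding of the svmRadial method and worth checking against your caret version. The sigma and C values are illustrative placeholders, not tuned ones.

```r
library(kernlab)

# Same toy data as in the question.
trainset <- data.frame(
  class = factor(c("Good", "Bad", "Good", "Good", "Bad", "Good",
                   "Good", "Good", "Good", "Bad", "Bad", "Bad")),
  age = c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28, 25, 24))
testset <- data.frame(
  class = factor(c("Good", "Bad", "Good")),
  age = c(64, 23, 50))

set.seed(1056)
# prob.model = TRUE asks ksvm to fit an additional probability model
# on top of the SVM decision values.
# sigma = 0.7 and C = 1 are assumed values for illustration only.
fit <- ksvm(class ~ age, data = trainset, kernel = "rbfdot",
            kpar = list(sigma = 0.7), C = 1, prob.model = TRUE)

# One column per class level; each row sums to 1.
probs <- predict(fit, testset, type = "probabilities")
print(probs)
```

If the model is fitted without prob.model = TRUE, kernlab has no probability model to use for type = "probabilities", which is consistent with caret falling back to NAs and issuing the warning in the question.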

EDIT

To answer your new question, you can get the positions of the support vectors in your original data set with alphaindex(svmFit$finalModel), and the corresponding coefficients with coef(svmFit$finalModel).
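A minimal sketch of mapping the support vectors back to training rows follows. It fits a ksvm model directly so that it runs on its own; the names fit and sv_idx and the sigma and C values are placeholders for illustration. With the caret model you would use svmFit$finalModel in place of fit, assuming the rows reach kernlab in their original order.

```r
library(kernlab)

# Same toy training data as in the question.
trainset <- data.frame(
  class = factor(c("Good", "Bad", "Good", "Good", "Bad", "Good",
                   "Good", "Good", "Good", "Bad", "Bad", "Bad")),
  age = c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28, 25, 24))

set.seed(1056)
# sigma = 0.7 and C = 1 are assumed values for illustration only.
fit <- ksvm(class ~ age, data = trainset, kernel = "rbfdot",
            kpar = list(sigma = 0.7), C = 1)

# For a two-class problem, alphaindex() returns a list with one element:
# the row positions of the support vectors in the training data.
sv_idx <- alphaindex(fit)[[1]]

# The training rows that act as support vectors.
print(trainset[sv_idx, ])

# The coefficient attached to each support vector, in the same order.
print(coef(fit)[[1]])
```

Any training row whose position is not in sv_idx is not a support vector, so it does not appear in the fitted decision function.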
