StepLDA without Cross Validation


Question

I would like to select variables on the basis of the training error, so I set method in trainControl to "none". However, if I run the function below twice, I get two different correctness rates. In this example the difference is negligible, but I would not have expected any difference at all.

Does anybody know where this difference comes from?

library(caret)

c_1 <- trainControl(method = "none")

maxvar    <- 4
direction <- "forward"
tune_1    <- data.frame(maxvar, direction)

tr <- train(Species ~ ., data = iris, method = "stepLDA", trControl = c_1, tuneGrid = tune_1)

First run:

`stepwise classification', using 10-fold cross-validated correctness rate of method lda'.
150 observations of 4 variables in 3 classes; direction: forward
stop criterion: assemble 4 best variables.
correctness rate: 0.96;  in: "Petal.Width";  variables (1): Petal.Width 
correctness rate: 0.96667;  in: "Sepal.Width";  variables (2): Petal.Width, Sepal.Width 
correctness rate: 0.97333;  in: "Petal.Length";  variables (3): Petal.Width, Sepal.Width, Petal.Length 
correctness rate: 0.98;  in: "Sepal.Length";  variables (4): Petal.Width, Sepal.Width, Petal.Length, Sepal.Length 

 hr.elapsed min.elapsed sec.elapsed 
       0.00        0.00        0.28 

Second run:

> train(Species~., data=iris, method = "stepLDA", trControl=c_1, tuneGrid=tune_1)->tr
 `stepwise classification', using 10-fold cross-validated correctness rate of method lda'.
150 observations of 4 variables in 3 classes; direction: forward
stop criterion: assemble 4 best variables.
correctness rate: 0.96;  in: "Petal.Width";  variables (1): Petal.Width 
correctness rate: 0.96;  in: "Sepal.Width";  variables (2): Petal.Width, Sepal.Width 
correctness rate: 0.96667;  in: "Petal.Length";  variables (3): Petal.Width, Sepal.Width, Petal.Length 
correctness rate: 0.98;  in: "Sepal.Length";  variables (4): Petal.Width, Sepal.Width, Petal.Length, Sepal.Length 

 hr.elapsed min.elapsed sec.elapsed 
        0.0         0.0         0.3 

Recommended answer

You are still doing 10-fold cross-validation. The folds are assigned at random, so unless you set the seed you will get a slightly different answer each time you train the model.

If you run this piece of code, including the set.seed call, you will get the same correctness rates every time.

set.seed(42)
tr <- train(Species~., data=iris, method = "stepLDA", trControl=c_1, tuneGrid=tune_1)
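To see the effect of the seed in isolation, the following sketch calls klaR's stepclass directly, twice with the same seed, and compares the two selection traces. The helper name run_once is made up for illustration, and output = FALSE is used only to silence the progress messages.

```r
library(klaR)  # provides stepclass(); attaches MASS, which supplies lda()

# Hypothetical helper: seed the RNG, then run one forward selection on iris.
run_once <- function(seed) {
  set.seed(seed)
  stepclass(iris[, 1:4], iris$Species, method = "lda",
            direction = "forward", maxvar = 4, fold = 10, output = FALSE)
}

first  <- run_once(42)
second <- run_once(42)

# The selection traces (variables entered and their CV correctness rates)
# should match when the seed is the same.
same_trace <- identical(first$process, second$process)
print(same_trace)
```

Changing the seed in one of the two calls is a quick way to reproduce the small differences seen in the question.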

Edit based on comment:

The 10-fold cross-validated correctness rate does not come from caret, but from the stepclass function in the klaR package.

stepclass(x, grouping, method, improvement = 0.05, maxvar = Inf, start.vars = NULL, direction = c("both", "forward", "backward"), criterion = "CR", fold = 10, cv.groups = NULL, output = TRUE, min1var = TRUE, ...)

fold: parameter for cross-validation; omitted if 'cv.groups' is specified.
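Because fold is ignored when cv.groups is given, another way to get repeatable results is to fix the fold assignment yourself. The sketch below assumes cv.groups takes one fold index per observation; the assignment vector fixed_groups is invented for illustration and is not part of the original answer.

```r
library(klaR)  # provides stepclass(); attaches MASS, which supplies lda()

# Assumed usage: one fold index (1..10) per row. iris is sorted by species,
# 50 rows each, so cycling 1:10 puts 5 rows of every class into each fold.
fixed_groups <- rep(1:10, length.out = nrow(iris))

run_fixed <- function() {
  stepclass(iris[, 1:4], iris$Species, method = "lda",
            direction = "forward", maxvar = 4,
            cv.groups = fixed_groups, output = FALSE)
}

# No seed is set here; the folds are fixed, so no random draw is involved.
fixed_a <- run_fixed()
fixed_b <- run_fixed()
same_fixed_trace <- identical(fixed_a$process, fixed_b$process)
print(same_fixed_trace)
```

This still reports a cross-validated rate rather than a training error; it only removes the run-to-run variation.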

You can adjust this if you want by adding the fold parameter to the train call:

tr <- train(Species~., data=iris, method = "stepLDA", trControl=c_1, tuneGrid=tune_1, fold = 1)

But a fold of 1 is meaningless: you will get a bunch of warnings and errors.
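If the training (resubstitution) error itself is what you are after, one option is to compute it outside of stepclass. A minimal sketch, assuming a plain lda fit on all four predictors; it is not what caret or klaR do internally, and it performs no variable selection:

```r
library(MASS)  # provides lda()

# Fit LDA on the full data set, then predict the same rows back.
fit <- lda(Species ~ ., data = iris)
train_pred <- predict(fit)$class  # with no newdata, the training data is used

# Training correctness rate: share of rows classified correctly.
train_acc <- mean(train_pred == iris$Species)
print(train_acc)
```

Repeating this for candidate subsets of variables gives a deterministic training-error criterion, since no resampling is involved.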
