How to extract components after performing principal component regression for further analysis in the R caret package

Question

I have a dataset containing 151 variables that were found to be highly collinear, so I performed principal component regression on it as follows:

library(caret)

ctrl <- trainControl(method = "repeatedcv", repeats = 10, savePredictions = TRUE)
model <- train(RT..seconds.~., data = cadets100, method = "pcr", trControl = ctrl)

which gives me: RMSE = 65.7, R-squared = 0.443

How do I extract these components afterwards so that I can apply further analysis to them (e.g. fit an SVM or a random forest)?

Recommended answer

If you want to fit an SVM, RF, or any other second-stage model on top of the scores of your PCs, there is a shortcut that saves you from re-inventing the caret package.

You can do the following:

library(caret)
library(kernlab)  # provides sigest()

set.seed(1)
# Estimate a plausible range for the RBF kernel parameter sigma from the data
sigDist <- sigest(RT..seconds.~., data = cadets100, frac = 1)

svmGrid <- expand.grid(.sigma = sigDist, .C = 2^(-2:7))
set.seed(2)
svmPCAFit <- train(RT..seconds.~.,
                  data = cadets100,
                  method = "svmRadial",
                  tuneGrid = svmGrid,
                  preProcess = c("center","scale","pca"), # if center and scale needed
                  trControl = ctrl)

This way, PCA is carried out within each resampling fold, and the PC scores, rather than the original observations, are passed to the SVM. You don't need to do this yourself; caret does it for you automatically. Everything you pass in preProcess is also applied to any new data set, whether that is a held-out CV fold or a separate holdout test set.
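If you do want the scores in hand, the same preprocessing can be run on its own with caret's preProcess() and predict(). The following is only a minimal sketch: mtcars stands in for cadets100 and mpg for RT..seconds. (both are assumptions so that the example runs by itself), and the 95% variance threshold is just an illustrative choice.

```r
library(caret)

# Stand-in data (assumption): mtcars replaces cadets100, mpg replaces RT..seconds.
predictors <- mtcars[, setdiff(names(mtcars), "mpg")]

# Estimate centering, scaling and PCA on the predictors only
pp <- preProcess(predictors, method = c("center", "scale", "pca"), thresh = 0.95)

# PC scores as a data frame with columns PC1, PC2, ...
scores <- predict(pp, predictors)

# Loadings (rotation matrix), in case the components themselves are of interest
loadings <- pp$rotation

# Re-attach the outcome so the scores can be passed to any other model
pcData <- data.frame(mpg = mtcars$mpg, scores)
head(pcData)
```

Note that scores computed once on the full data set like this sit outside the resampling loop, so for honest performance estimates the preProcess argument of train() shown above is usually the safer route.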

However, if you want to perform PLS, which unlike PCA is a supervised method, before passing the scores to the next classifier, you have to define a custom model in caret (see here). For more examples you can also study the code here, where you will find two custom models: one for PLS-RF and one for PLS-LDA.
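Coming back to the literal question, the components of a model fitted with method = "pcr" can probably also be read straight from the underlying fit. The sketch below assumes that finalModel is the object returned by pls::pcr(), so that the scores() and loadings() extractors from the pls package apply; mtcars and mpg are again stand-ins for cadets100 and RT..seconds., and the 5-fold CV setup is only there to keep the example quick.

```r
library(caret)
library(pls)

set.seed(1)
ctrl <- trainControl(method = "cv", number = 5)

# Stand-in data (assumption): mtcars replaces cadets100, mpg replaces RT..seconds.
model <- train(mpg ~ ., data = mtcars, method = "pcr",
               trControl = ctrl, tuneLength = 3)

# Assumed to be the fitted pls::pcr object for the selected number of components
fm <- model$finalModel

# Training-set scores; unclass() drops the "scores" class so the matrix converts cleanly
pcScores <- as.data.frame(unclass(scores(fm)))
names(pcScores) <- paste0("PC", seq_len(ncol(pcScores)))

# Loadings of the retained components
pcLoadings <- unclass(loadings(fm))

head(pcScores)
```

The resulting pcScores data frame has one row per training observation and one column per retained component, and can be combined with the outcome and handed to an SVM or random forest.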
