caret returns different predictions from the train object than from the extracted final model
Question
I prefer to use caret when fitting models because of its relative speed and preprocessing capabilities. However, I'm slightly confused about how it makes predictions. When I compare predictions made directly from the train object with predictions made from the extracted final model, I see very different numbers. The predictions from the train object appear to be more accurate.
library(caret)
library(ranger)
library(magrittr)  # provides the %>% pipe used further down
x1 <- rnorm(100)
x2 <- rbeta(100, 1, 1)
y <- 2*x1 + x2 + 5*x1*x2
data <- data.frame(x1, x2, y)
fitRanger <- train(y ~ x1 + x2, data = data,
method = 'ranger',
tuneLength = 1,
preProcess = c('knnImpute', 'center', 'scale'))
predict.data <- data.frame(x1 = rnorm(10), x2 = rbeta(10, 1, 1))
prediction1 <- predict(fitRanger, newdata = predict.data)
prediction2 <- predict(fitRanger$finalModel, data = predict.data)$predictions
results <- data.frame(prediction1, prediction2)
results
I'm fairly sure this has something to do with how the data is preprocessed in the train object, but even when I preprocess the test data myself and use the ranger model to make predictions, the values are different:
predict.data.processed <- predict.data %>%
preProcess(method = c('knnImpute',
'center',
'scale')) %>% .$data
results3 <- predict(fitRanger$finalModel, data = predict.data.processed)$predictions
results <- cbind(results, results3)
results
I want to extract the predictions from each individual tree in the ranger model, which I can't do through caret. Any thoughts?
Recommended answer
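To get the same predictions from the final model as from the caret train object, you must pre-process the new data the same way caret does. predict() on the train object first applies the pre-processing stored in fitRanger$preProcess, which uses the centering and scaling values estimated on the training data, whereas fitRanger$finalModel expects data that has already been transformed. Calling preProcess() on predict.data instead estimates new values from the 10 new rows, so the model inputs (and therefore the predictions) differ. Using your example with set.seed(1):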
caret prediction:
prediction1 <- predict(fitRanger,
newdata = predict.data)
ranger prediction from the final model, with caret's stored pre-processing applied to predict.data first:
prediction2 <- predict(fitRanger$finalModel,
data = predict(fitRanger$preProcess,
predict.data))$predictions
all.equal(prediction1,
prediction2)
#output
TRUE
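For the per-tree predictions asked about in the question, one possible approach is to call ranger's predict method on the final model with predict.all = TRUE, again feeding it data transformed with the stored pre-processing. The following is a minimal sketch that rebuilds the example from the question; the names predict.data.pp and per.tree are introduced here for illustration only, and the seed is the one assumed in this answer.

```r
library(caret)
library(ranger)

set.seed(1)
x1 <- rnorm(100)
x2 <- rbeta(100, 1, 1)
y <- 2*x1 + x2 + 5*x1*x2
data <- data.frame(x1, x2, y)

fitRanger <- train(y ~ x1 + x2, data = data,
                   method = 'ranger',
                   tuneLength = 1,
                   preProcess = c('knnImpute', 'center', 'scale'))

predict.data <- data.frame(x1 = rnorm(10), x2 = rbeta(10, 1, 1))

# Transform the new data with the pre-processing stored in the train object,
# i.e. with the statistics estimated on the training data
predict.data.pp <- predict(fitRanger$preProcess, predict.data)

# predict.all = TRUE asks ranger for the prediction of every tree;
# for regression the result is a matrix: one row per observation, one column per tree
per.tree <- predict(fitRanger$finalModel,
                    data = predict.data.pp,
                    predict.all = TRUE)$predictions

dim(per.tree)

# Averaging over the trees should reproduce the forest-level prediction from caret
all.equal(rowMeans(per.tree),
          unname(predict(fitRanger, newdata = predict.data)))
```

Each column of per.tree can then be inspected on its own, for example to look at the spread of the tree predictions for a given observation.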