Caret returns different predictions with caret train object than it does with the extracted final model


Question

I prefer to use caret when fitting models because of its relative speed and preprocessing capabilities. However, I'm slightly confused about how it makes predictions. When I compare predictions made directly from the train object with predictions made from the extracted final model, I see very different numbers. The predictions from the train object appear to be more accurate.

library(caret)
library(ranger)

x1 <- rnorm(100)
x2 <- rbeta(100, 1, 1)

y <- 2*x1 + x2 + 5*x1*x2

data <- data.frame(x1, x2, y)
fitRanger <- train(y ~ x1 + x2, data = data,
                   method = 'ranger', 
                   tuneLength = 1,
                   preProcess = c('knnImpute', 'center', 'scale'))

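predict.data <- data.frame(x1 = rnorm(10), x2 = rbeta(10, 1, 1))
# Prediction through the train object: caret applies its stored pre-processing to newdata
prediction1 <- predict(fitRanger, newdata = predict.data)
# Prediction through the extracted ranger model: predict.data is passed in unprocessed
prediction2 <- predict(fitRanger$finalModel, data = predict.data)$predictions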

results <- data.frame(prediction1, prediction2)
results

I'm positive it has something to do with how the data is preprocessed in the train object, but even when I preprocess the test data myself and use the ranger model to make predictions, the values are different:

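library(magrittr)  # provides %>%; assumed here, since the snippet does not load it

# Note: this fits a new preProcess object on the test data itself
predict.data.processed <- predict.data %>% 
                             preProcess(method = c('knnImpute', 
                                                   'center', 
                                                   'scale')) %>% .$data

results3 <- predict(fitRanger$finalModel, data = predict.data.processed)$predictions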

results <- cbind(results, results3)
results

I want to extract the predictions from each individual tree in the ranger model, which I can't do in caret. Any thoughts?

Recommended answer

To get the same predictions from the final model as from the caret train object, you should pre-process the data in the same way. Using your example with set.seed(1):

caret prediction:

prediction1 <- predict(fitRanger,
                       newdata = predict.data)

ranger prediction on the final model, with caret's pre-processing applied to predict.data:

prediction2 <- predict(fitRanger$finalModel,
                       data = predict(fitRanger$preProcess,
                                      predict.data))$predictions
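
A likely reason the piped preProcess() call in the question still gives different numbers is that it estimates new centering and scaling statistics from the 10 test rows, whereas predict(fitRanger$preProcess, ...) reuses the statistics estimated on the training data. The base-R sketch below illustrates that difference on a single column; train_x and new_x are made-up stand-ins, and it only mimics the 'center' and 'scale' steps rather than calling caret itself.

```r
set.seed(1)

# Hypothetical stand-ins for one training column and one new-data column
train_x <- rnorm(100)
new_x   <- rnorm(10)

# Centering/scaling with the mean and sd estimated on the training data,
# which is what applying a fitted pre-processing object to new data amounts to
scaled_with_train_stats <- (new_x - mean(train_x)) / sd(train_x)

# Centering/scaling with statistics re-estimated from the new rows themselves,
# which is what fitting a fresh pre-processing step on the new data amounts to
scaled_with_new_stats <- (new_x - mean(new_x)) / sd(new_x)

# Compare the two versions of the same inputs
isTRUE(all.equal(scaled_with_train_stats, scaled_with_new_stats))
```

Because the forest was trained on values standardized with the training statistics, feeding it inputs standardized any other way moves every observation to a different point in predictor space.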

all.equal(prediction1,
          prediction2)
#output
TRUE
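
On the last part of the question (predictions from each individual tree): ranger's predict method accepts a predict.all argument, so one option is to apply caret's pre-processing first and then call ranger directly on the final model. This is a minimal sketch reusing the question's simulated data; the names predict.data.pp and per.tree are introduced here for illustration only.

```r
library(caret)
library(ranger)

set.seed(1)
x1 <- rnorm(100)
x2 <- rbeta(100, 1, 1)
y  <- 2*x1 + x2 + 5*x1*x2
data <- data.frame(x1, x2, y)

fitRanger <- train(y ~ x1 + x2, data = data,
                   method = 'ranger',
                   tuneLength = 1,
                   preProcess = c('knnImpute', 'center', 'scale'))

predict.data <- data.frame(x1 = rnorm(10), x2 = rbeta(10, 1, 1))

# Apply the pre-processing that train() fitted on the training data
predict.data.pp <- predict(fitRanger$preProcess, predict.data)

# predict.all = TRUE asks ranger for one prediction per tree
# instead of the aggregated forest prediction
per.tree <- predict(fitRanger$finalModel,
                    data = predict.data.pp,
                    predict.all = TRUE)$predictions

dim(per.tree)       # rows are observations, columns are trees
rowMeans(per.tree)  # for regression, this should agree with predict(fitRanger, predict.data)
```

Averaging across the columns is a quick sanity check that the per-tree matrix was produced from correctly pre-processed inputs.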
