R random forest error: type of predictors in new data do not match

Views: 67 | Posted: 2021/7/2 20:05:02 | Tags: r, random-forest


Problem description

I am trying to use the quantile regression forest function in R (quantregForest), which is built on the randomForest package. I am getting a type mismatch error and I can't figure out why.

I train the model using

qrf <- quantregForest(x = xtrain, y = ytrain)

which works without a problem, but when I try to test with new data like

quant.newdata <- predict(qrf, newdata= xtest)

it gives the following error:

Error in predict.quantregForest(qrf, newdata = xtest) : 
Type of predictors in new data do not match types of the training data.

My training and testing data come from separate files (hence separate data frames) but have the same format. I have checked the classes of the predictors with

sapply(xtrain, class)
sapply(xtest, class)

The output is as follows:

> sapply(xtrain, class)
pred1     pred2     pred3     pred4     pred5     pred6     pred7     pred8 
"factor" "integer" "integer" "integer"  "factor"  "factor" "integer"  "factor" 
pred9    pred10    pred11    pred12 
"factor"  "factor"  "factor"  "factor" 


> sapply(xtest, class)
pred1     pred2     pred3     pred4     pred5     pred6     pred7     pred8 
"factor" "integer" "integer" "integer"  "factor"  "factor" "integer"  "factor" 
pred9    pred10    pred11    pred12 
"factor"  "factor"  "factor"  "factor"

They are exactly the same. I also checked for NA values: neither xtrain nor xtest contains any. Am I missing something trivial here?
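One thing `sapply(x, class)` does not show is whether each factor column has the same set of levels in both data frames. Files read separately can produce columns that are all of class "factor" yet carry different levels. Whether that is what triggers the error here is an assumption, not something confirmed in this post. The following is a minimal base R sketch of such a check, using small made-up data frames (`train_df`, `test_df`) and a hypothetical helper `compare_factor_levels`:

```r
# Hypothetical toy data standing in for xtrain / xtest (not the poster's data).
train_df <- data.frame(
  pred1 = factor(c("a", "b", "c")),
  pred2 = c(1L, 2L, 3L)
)
test_df <- data.frame(
  pred1 = factor(c("a", "b")),   # level "c" never appears in this file
  pred2 = c(4L, 5L)
)

# Hypothetical helper: for each factor column of `train`, report whether the
# same-named column in `test` has identical levels (same values, same order).
compare_factor_levels <- function(train, test) {
  factor_cols <- names(train)[sapply(train, is.factor)]
  sapply(factor_cols, function(col) {
    identical(levels(train[[col]]), levels(test[[col]]))
  })
}

sapply(train_df, class)   # classes of the training columns
sapply(test_df, class)    # classes of the test columns: identical to the above
level_match <- compare_factor_levels(train_df, test_df)
print(level_match)        # pred1 is FALSE because the level sets differ
```

Running the same helper on the real `xtrain` and `xtest` would show which factor columns, if any, differ in their levels.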

Update I: running the prediction on the training data still gives an error

> quant.newdata <- predict(qrf, newdata = xtrain)
Error in predict.quantregForest(qrf, newdata = xtrain) : 
names of predictor variables do not match

Update II: I combined my training and test sets so that rows 1 to 101 are the training data and the rest are the test data. I modified the example provided in (quantregForest) as:

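# Read the combined training + test data from a single file
data <- read.table("toy.txt", header = TRUE)
n <- nrow(data)
indextrain <- 1:101                 # rows 1 to 101 are the training data
xtrain <- data[indextrain, 3:14]    # predictors (columns 3 to 14)
xtest <- data[-indextrain, 3:14]
ytrain <- data[indextrain, 15]      # response (column 15)
ytest <- data[-indextrain, 15]

qrf <- quantregForest(x = xtrain, y = ytrain)
quant.newdata <- predict(qrf, newdata = xtest)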

And it works! I'd appreciate it if anyone could explain why it works this way and not the other way.
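One possible explanation, offered as an assumption rather than a confirmed diagnosis: when both sets come from a single `read.table` call, each factor column is created once, and row subsets of a data frame keep the full set of levels, so `xtrain` and `xtest` share identical levels. With two separate files, each factor gets only the levels present in its own file. The sketch below shows one way to align separately loaded test data to the training levels in base R; the data frames and the helper `align_levels` are made up for illustration.

```r
# Hypothetical toy data standing in for separately loaded xtrain / xtest.
train_df <- data.frame(pred1 = factor(c("a", "b", "c")), pred2 = c(1L, 2L, 3L))
test_df  <- data.frame(pred1 = factor(c("b", "a")),      pred2 = c(4L, 5L))

# Hypothetical helper: re-create every factor column of `test` using the
# levels of the same-named column in `train`. Values in `test` that are not
# among the training levels become NA.
align_levels <- function(train, test) {
  for (col in names(train)) {
    if (is.factor(train[[col]])) {
      test[[col]] <- factor(as.character(test[[col]]),
                            levels = levels(train[[col]]))
    }
  }
  test
}

test_aligned <- align_levels(train_df, test_df)
print(levels(test_df$pred1))       # levels before alignment
print(levels(test_aligned$pred1))  # levels after alignment match train_df
```

After alignment, it is worth checking the aligned columns for NA values, since any test value unseen in training is converted to NA by this approach.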
