Caret's train.recipe does not seem to apply the recipe step that removes NAs, and cross-validation subsequently fails


Problem description


The caret package does not seem to apply the recipe step that removes NAs before cross-validation. I guess I am overlooking something...

library(caret)
library(recipes)
library(data.table)

iris_dt <- as.data.table(iris)
iris_dt[3:5, ':='(Petal.Length = NA)]
control <- trainControl(method = 'cv', number = 2, allowParallel = TRUE)
rec <- recipe(Petal.Length ~ Sepal.Width, iris_dt) %>% step_naomit(all_outcomes(), all_predictors())
train(rec, iris_dt, method = 'lm', trControl = control)

Error in quantile.default(y, probs = seq(0, 1, length = cuts)) : missing values and NaN's not allowed if 'na.rm' is FALSE

It also fails when the regressor contains NAs, although with a different error message. When the data is prepped and baked first and then passed to the x/y interface of train(), it works.

Many thanks for any hints.

Solution

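The recipe works fine, but the resamples are created before the recipe is applied, so the rows with missing values are still present when the folds are built. You should remove those rows before calling train, or use the formula method with na.action = na.omit: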

> iris_dt <- as.data.table(iris)
> iris_dt[3:5,':='(Petal.Length=NA)]
> control <- trainControl(method='cv',number=2,allowParallel = T)
> rec <- recipe(Petal.Length ~ Sepal.Width,iris_dt) %>% step_naomit(all_outcomes(),all_predictors())
> train(Petal.Length ~ Sepal.Width,iris_dt,method='lm',trControl = control, na.action = na.omit)
Linear Regression 

150 samples
  1 predictor

No pre-processing
Resampling: Cross-Validated (2 fold) 
Summary of sample sizes: 74, 73 
Resampling results:

  RMSE      Rsquared   MAE     
  1.610659  0.1885815  1.363651

Tuning parameter 'intercept' was held constant at a value of TRUE
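
The other option mentioned above, removing the incomplete rows before calling train, could look roughly like the sketch below. It is only an illustration under a few assumptions: it uses a plain data.frame instead of a data.table, the names iris_na and iris_complete are made up for this example, and step_naomit is left out of the recipe because the rows are already gone.

```r
library(caret)
library(recipes)

# Recreate the data from the question (plain data.frame, an assumption of this sketch)
iris_na <- iris
iris_na[3:5, "Petal.Length"] <- NA

# Drop rows with a missing outcome or predictor BEFORE train() builds the folds
model_cols <- c("Petal.Length", "Sepal.Width")
iris_complete <- iris_na[complete.cases(iris_na[, model_cols]), model_cols]

control <- trainControl(method = "cv", number = 2)

# The recipe no longer needs step_naomit(), since no NAs are left
rec <- recipe(Petal.Length ~ Sepal.Width, data = iris_complete)

set.seed(1)  # only to make the fold assignment reproducible
fit <- train(rec, data = iris_complete, method = "lm", trControl = control)

nrow(iris_complete)  # 147 rows remain after dropping the 3 incomplete ones
```

With the NAs removed up front, the resampling indices and the recipe both see the same 147 complete rows, so the recipe interface can be kept for any other preprocessing steps.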
