Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, ...) when using glmnet in caret

Problem description

I am getting an error when using glmnet in caret.

The example below loads the libraries:

library(dplyr)
library(caret)
library(C50)

Load the churn data set from library C50:

data(churn)

Create the x and y variables:

churn_x <- subset(churnTest, select= -churn)   
churn_y <- churnTest[[20]]

Use createFolds() to create 5 CV folds on churn_y, the target variable:

myFolds <- createFolds(churn_y, k = 5)

Create the trainControl object, myControl:

myControl <- trainControl(
 summaryFunction = twoClassSummary,
 classProbs = TRUE, # IMPORTANT!
 verboseIter = TRUE,
 savePredictions = TRUE,
 index = myFolds
)

Fit the glmnet model, model_glmnet:

model_glmnet <- train(
  x = churn_x, y = churn_y,
  metric = "ROC",
  method = "glmnet",
  trControl = myControl
)

I get the following error:

Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
  NA/NaN/Inf in foreign function call (arg 5)
In addition: Warning message:
In lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
  NAs introduced by coercion

I have checked, and there are no missing values in churn_x:

sum(is.na(churn_x))

Does anyone know the answer?

Recommended answer

The problem is in the model specification. If you use the formula interface of caret's train(), the training works:

train <- data.frame(churn_x, churn_y)

model_glmnet <- train(churn_y ~ ., data = train,
  metric = "ROC",
  method = "glmnet",
  trControl = myControl
)

> model_glmnet$results
  alpha       lambda       ROC      Sens      Spec      ROCSD     SensSD      SpecSD
1  0.10 0.0001754386 0.6958156 0.2845934 0.9123349 0.01855530 0.01616471 0.004002873
2  0.10 0.0017543858 0.7187303 0.2901986 0.9185721 0.01681286 0.01415863 0.005347573
3  0.10 0.0175438576 0.7399174 0.2355121 0.9487161 0.01482812 0.03932741 0.010769455
4  0.55 0.0001754386 0.6988285 0.2901800 0.9121614 0.01907845 0.01312159 0.004200233
5  0.55 0.0017543858 0.7260286 0.2946617 0.9185714 0.01761485 0.02171189 0.006755247
6  0.55 0.0175438576 0.7630039 0.2008939 0.9617103 0.01743847 0.03989938 0.006118592
7  1.00 0.0001754386 0.7009482 0.2924146 0.9119881 0.01958200 0.01233419 0.004157393
8  1.00 0.0017543858 0.7313495 0.2957728 0.9203040 0.01797853 0.02356945 0.008478577
9  1.00 0.0175438576 0.7672690 0.1595779 0.9760892 0.01935176 0.01935583 0.007938801

However, it does not work when you specify x and y, because glmnet expects x as a numeric model matrix. When you supply a formula, caret takes care of creating the model matrix. If you specify only x and y, caret assumes that x already is a model matrix and passes it to glmnet as is. Here churn_x contains factor columns, which cannot be converted to numbers directly; that is most likely the source of the "NAs introduced by coercion" warning and of the NA/NaN/Inf error that follows. For instance, this works:

x <- model.matrix(churn_y ~ ., data = train)

model_glmnet2 <- train(x = x, y = churn_y,
                      metric = "ROC",
                      method = "glmnet",
                      trControl = myControl
)
> model_glmnet2$results
  alpha       lambda       ROC      Sens      Spec      ROCSD     SensSD      SpecSD
1  0.10 0.0001754386 0.6958156 0.2845934 0.9123349 0.01855530 0.01616471 0.004002873
2  0.10 0.0017543858 0.7187303 0.2901986 0.9185721 0.01681286 0.01415863 0.005347573
3  0.10 0.0175438576 0.7399174 0.2355121 0.9487161 0.01482812 0.03932741 0.010769455
4  0.55 0.0001754386 0.6988285 0.2901800 0.9121614 0.01907845 0.01312159 0.004200233
5  0.55 0.0017543858 0.7260286 0.2946617 0.9185714 0.01761485 0.02171189 0.006755247
6  0.55 0.0175438576 0.7630039 0.2008939 0.9617103 0.01743847 0.03989938 0.006118592
7  1.00 0.0001754386 0.7009482 0.2924146 0.9119881 0.01958200 0.01233419 0.004157393
8  1.00 0.0017543858 0.7313495 0.2957728 0.9203040 0.01797853 0.02356945 0.008478577
9  1.00 0.0175438576 0.7672690 0.1595779 0.9760892 0.01935176 0.01935583 0.007938801
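
To make the difference concrete, here is a minimal sketch in base R on a hypothetical toy data frame (the names toy, plan and minutes are made up for illustration and are not part of the churn data). It only mimics the kind of coercion that presumably happens inside the x/y call; it is not the actual caret or glmnet code path.

```r
# Hypothetical toy data: one factor column and one numeric column.
toy <- data.frame(
  plan    = factor(c("yes", "no", "yes", "no")),
  minutes = c(10.5, 20.1, 30.7, 40.2)
)

# No missing values, just as in churn_x.
sum(is.na(toy))

# A data frame that contains a factor becomes a character matrix.
m <- as.matrix(toy)
storage.mode(m)  # "character"

# Forcing that matrix to numeric turns the factor labels into NA.
bad <- suppressWarnings(matrix(as.numeric(m), nrow = nrow(m)))
anyNA(bad)       # TRUE

# model.matrix() expands the factor into numeric dummy columns instead.
# "- 1" drops the intercept column (an illustrative choice).
good <- model.matrix(~ . - 1, data = toy)
storage.mode(good)
anyNA(good)
```

The check sum(is.na(...)) returns 0 for the toy data as well, which is why looking for missing values alone does not reveal the problem.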

model.matrix is needed only when there are factor features.
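
One way to apply that rule is to check the column types first and build a model matrix only when a non-numeric column is present. The helper below, to_numeric_matrix, is a hypothetical name introduced for this sketch and is not part of caret or glmnet; it assumes the input is a data frame without missing values.

```r
# Hypothetical helper (not part of caret or glmnet): return a numeric matrix
# that can be passed as x, expanding factors only when that is needed.
to_numeric_matrix <- function(df) {
  all_numeric <- all(vapply(df, is.numeric, logical(1)))
  if (all_numeric) {
    # Nothing to expand; a plain conversion is enough.
    as.matrix(df)
  } else {
    # Expand factors into dummy columns and drop the intercept column,
    # since glmnet fits its own intercept by default.
    model.matrix(~ ., data = df)[, -1, drop = FALSE]
  }
}

# Toy inputs, made up for illustration.
num_only <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
mixed    <- data.frame(a = c(1, 2, 3), g = factor(c("u", "v", "u")))

x1 <- to_numeric_matrix(num_only)
x2 <- to_numeric_matrix(mixed)
colnames(x2)
```

With the data from the question, to_numeric_matrix(churn_x) could then be passed as x to train(), under the same assumptions.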
