Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels): factor X has new levels


Problem description

I ran a logistic regression:

 EW <- glm(everwrk~age_p + r_maritl, data = NH11, family = "binomial")

Moreover, I want to predict everwrk for each level of r_maritl.

r_maritl has the following levels:

levels(NH11$r_maritl)
 "0 Under 14 years" 
 "1 Married - spouse in household" 
 "2 Married - spouse not in household"
 "3 Married - spouse in household unknown" 
 "4 Widowed"                               
 "5 Divorced"                             
 "6 Separated"                             
 "7 Never married"                        
 "8 Living with partner"  
 "9 Unknown marital status"  

So I did:

predEW <- with(NH11,
  expand.grid(r_maritl = c("0 Under 14 years", "1 Married - spouse in household",
                           "2 Married - spouse not in household",
                           "3 Married - spouse in household unknown",
                           "4 Widowed", "5 Divorced", "6 Separated", "7 Never married",
                           "8 Living with partner", "9 Unknown marital status"),
              age_p = mean(age_p, na.rm = TRUE)))

cbind(predEW, predict(EW, type = "response",
                        se.fit = TRUE, interval = "confidence",
                        newdata = predEW))

The problem is that I get the following error:

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : factor r_maritl has new levels 0 Under 14 years, Married - spouse in household unknown

Sample data:

str(NH11$age_p)
num [1:33014] 47 18 79 51 43 41 21 20 33 56 ...

str(NH11$everwrk)
Factor w/ 2 levels "2 No","1 Yes": NA NA 2 NA NA NA NA NA 2 2 ...

str(NH11$r_maritl)
Factor w/ 10 levels "0 Under 14 years",..: 6 8 5 7 2 2 8 8 8 2 ...

Recommended answer

tl;dr It looks like your factor has some levels that are not represented in your data, and these get dropped from the factors used in the model. In hindsight this isn't terribly surprising, since you won't be able to predict responses for these levels. That said, it's mildly surprising that R doesn't do something nice for you, like generating NA values automatically. You can solve this problem by using levels(droplevels(NH11$r_maritl)) in constructing your prediction frame, or equivalently EW$xlevels$r_maritl.
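To see which levels a model actually kept, it can help to compare the declared levels with the levels stored in the fit. The sketch below uses a small made-up data frame (the names d, y and g are hypothetical, not from the question). It also illustrates a caveat: when a level occurs only in rows that are dropped because the response is NA, droplevels() on the raw column still keeps it, whereas the model's xlevels does not, so xlevels is probably the safer choice for data like NH11, where everwrk has many NA values.

```r
# Toy data (hypothetical): level "b" occurs only where y is NA,
# and level "d" is declared but never observed.
d <- data.frame(
  y = c(0, 1, 0, 1, 1, 0, NA, NA),
  g = factor(c("a", "a", "a", "c", "c", "c", "b", "b"),
             levels = c("a", "b", "c", "d"))
)

fit <- glm(y ~ g, data = d, family = binomial)

print(table(d$g))                  # counts per declared level, "d" has none
print(levels(droplevels(d$g)))     # levels observed in the raw column
print(fit$xlevels$g)               # levels the fitted model actually used

# Levels that predict() would reject as "new levels" for this fit
not_in_model <- setdiff(levels(d$g), fit$xlevels$g)
print(not_in_model)
```

Any level listed in not_in_model has to be left out of the prediction frame, since the model has no coefficient for it.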

Reproducible example:

maritl_levels <- c( "0 Under 14 years", "1 Married - spouse in household", 
  "2 Married - spouse not in household", "3 Married - spouse in household unknown", 
  "4 Widowed", "5 Divorced", "6 Separated", "7 Never married", "8 Living with partner", 
 "9 Unknown marital status")
set.seed(101)
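NH11 <- data.frame(everwrk=rbinom(1000,size=1,prob=0.5),
                 age_p=runif(1000,20,50),
                 # explicit factor(): since R 4.0.0, data.frame() no longer
                 # converts strings to factors by default
                 r_maritl = factor(sample(maritl_levels,size=1000,replace=TRUE),
                                   levels=maritl_levels))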

Let's make one level missing:

NH11 <- subset(NH11,as.numeric(NH11$r_maritl) != 3)

Fit the model:

EW <- glm(everwrk~r_maritl+age_p,data=NH11,family=binomial)
predEW <- with(NH11,
  expand.grid(r_maritl=levels(r_maritl),age_p=mean(age_p,na.rm=TRUE)))
predict(EW,newdata=predEW)

Success! That is, this reproduces the error:

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : factor r_maritl has new levels 2 Married - spouse not in household

Building the prediction frame from the levels stored in the fitted model avoids it:

predEW <- with(NH11,
           expand.grid(r_maritl=EW$xlevels$r_maritl,age_p=mean(age_p,na.rm=TRUE)))
predict(EW,newdata=predEW)
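The question also asked for predicted probabilities with confidence bounds. As far as I can tell from its documentation, predict.glm has no interval argument, so one common approach is to predict on the link scale with se.fit = TRUE and back-transform. The sketch below is a minimal, self-contained illustration on simulated data; the names dat, grp and age are hypothetical, and the 1.96 multiplier assumes an approximate 95% Wald interval.

```r
set.seed(101)
lev <- c("a", "b", "c")
dat <- data.frame(
  y   = rbinom(300, size = 1, prob = 0.5),
  age = runif(300, 20, 50),
  grp = factor(sample(lev, 300, replace = TRUE), levels = lev)
)
dat <- subset(dat, grp != "b")   # "b" stays a declared level but has no rows

fit <- glm(y ~ grp + age, data = dat, family = binomial)

# Prediction frame built only from the levels the model knows about
newd <- expand.grid(grp = fit$xlevels$grp, age = mean(dat$age))

# Predict on the link (logit) scale, then map back to probabilities
p <- predict(fit, newdata = newd, type = "link", se.fit = TRUE)
res <- cbind(newd,
             prob  = plogis(p$fit),
             lower = plogis(p$fit - 1.96 * p$se.fit),
             upper = plogis(p$fit + 1.96 * p$se.fit))
print(res)
```

Back-transforming the interval endpoints keeps the bounds inside (0, 1), which adding and subtracting standard errors on the response scale does not guarantee.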
