Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels): factor X has new levels
Question
I ran a logistic regression:
EW <- glm(everwrk~age_p + r_maritl, data = NH11, family = "binomial")
Moreover, I want to predict everwrk for each level of r_maritl.
r_maritl has the following levels:
levels(NH11$r_maritl)
"0 Under 14 years"
"1 Married - spouse in household"
"2 Married - spouse not in household"
"3 Married - spouse in household unknown"
"4 Widowed"
"5 Divorced"
"6 Separated"
"7 Never married"
"8 Living with partner"
"9 Unknown marital status"
So I did:
predEW <- with(NH11,
  expand.grid(r_maritl = c("0 Under 14 years",
                           "1 Married - spouse in household",
                           "2 Married - spouse not in household",
                           "3 Married - spouse in household unknown",
                           "4 Widowed",
                           "5 Divorced",
                           "6 Separated",
                           "7 Never married",
                           "8 Living with partner",
                           "9 Unknown marital status"),
              age_p = mean(age_p, na.rm = TRUE)))
cbind(predEW, predict(EW, type = "response",
                      se.fit = TRUE, interval = "confidence",
                      newdata = predEW))
The problem is that I get the following response:
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : factor r_maritl has new levels 0 Under 14 years, Married - spouse in household unknown
Sample data:
str(NH11$age_p)
num [1:33014] 47 18 79 51 43 41 21 20 33 56 ...
str(NH11$everwrk)
Factor w/ 2 levels "2 No","1 Yes": NA NA 2 NA NA NA NA NA 2 2 ...
str(NH11$r_maritl)
Factor w/ 10 levels "0 Under 14 years",..: 6 8 5 7 2 2 8 8 8 2 ...
Recommended answer
tl;dr It looks like some levels of your factor are not represented in the data used to fit the model, so they get dropped from the factor the model actually uses. In hindsight this isn't terribly surprising, since you can't predict responses for levels the model never saw. That said, it's mildly surprising that R doesn't do something nice for you, like generating NA values automatically. You can solve this problem by using levels(droplevels(NH11$r_maritl)) when constructing your prediction frame, or equivalently EW$xlevels$r_maritl.
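One way to check this diagnosis is to compare the levels a factor declares with the levels the fitted model kept. The sketch below uses made-up data (the names d, fit, declared, in_data and in_model are hypothetical, not from the question). It also illustrates a case that seems to match the question, where everwrk contains NA values: a level can occur in the raw column yet still be absent from the model, because glm() drops rows with a missing response before it drops unused levels.

```r
# Hypothetical toy data: level "d" never occurs, and every "c" row has a
# missing response, so glm() removes those rows before fitting.
set.seed(1)
d <- data.frame(
  y = rbinom(60, size = 1, prob = 0.5),
  x = runif(60),
  g = factor(rep(c("a", "b", "c"), each = 20),
             levels = c("a", "b", "c", "d"))
)
d$y[d$g == "c"] <- NA

fit <- glm(y ~ g + x, data = d, family = binomial)

declared <- levels(d$g)               # every level the factor declares
in_data  <- levels(droplevels(d$g))   # levels that occur in the raw column
in_model <- fit$xlevels$g             # levels in the rows actually fitted

# Levels that predict() would reject as "new levels"
setdiff(declared, in_model)
```

In this sketch droplevels() still keeps "c", while fit$xlevels$g does not, so taking the levels from the fitted model is probably the safer of the two options whenever rows may have been removed for missing values.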
Reproducible example:
maritl_levels <- c( "0 Under 14 years", "1 Married - spouse in household",
"2 Married - spouse not in household", "3 Married - spouse in household unknown",
"4 Widowed", "5 Divorced", "6 Separated", "7 Never married", "8 Living with partner",
"9 Unknown marital status")
set.seed(101)
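## factor() is explicit here because data.frame() no longer converts
## strings to factors by default in R >= 4.0
NH11 <- data.frame(everwrk=rbinom(1000,size=1,prob=0.5),
                   age_p=runif(1000,20,50),
                   r_maritl = factor(sample(maritl_levels,size=1000,replace=TRUE),
                                     levels=maritl_levels))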
Let's make one level missing:
NH11 <- subset(NH11,as.numeric(NH11$r_maritl) != 3)
Fit the model:
EW <- glm(everwrk~r_maritl+age_p,data=NH11,family=binomial)
predEW <- with(NH11,
expand.grid(r_maritl=levels(r_maritl),age_p=mean(age_p,na.rm=TRUE)))
predict(EW,newdata=predEW)
Success! (That is, the error is reproduced.)
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : factor r_maritl has new levels 2 Married - spouse not in household
Building the prediction frame from the levels stored in the fitted model avoids the error:
predEW <- with(NH11,
expand.grid(r_maritl=EW$xlevels$r_maritl,age_p=mean(age_p,na.rm=TRUE)))
predict(EW,newdata=predEW)
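If you would still like one row per declared level, with NA for the levels the model never saw, something along these lines may work. It is only a sketch on made-up data (d, fit, grid and seen are hypothetical names). It also computes an approximate 95% interval by hand, since as far as I know predict.glm() has no interval argument and silently ignores interval = "confidence". The interval is a Wald interval built on the link scale and back-transformed with plogis(), which is one common choice rather than the only one.

```r
# Hypothetical toy data: level "c" is declared but never observed.
set.seed(1)
d <- data.frame(
  y = rbinom(40, size = 1, prob = 0.5),
  x = runif(40),
  g = factor(rep(c("a", "b"), each = 20), levels = c("a", "b", "c"))
)
fit <- glm(y ~ g + x, data = d, family = binomial)

# Prediction frame over every declared level, at the mean of x.
grid <- expand.grid(g = levels(d$g), x = mean(d$x),
                    stringsAsFactors = FALSE)
seen <- grid$g %in% fit$xlevels$g   # TRUE for levels the model knows

# Predict on the link scale, only for the levels the model knows.
p <- predict(fit, newdata = grid[seen, , drop = FALSE],
             type = "link", se.fit = TRUE)

# Fill the results back in; rows for unseen levels stay NA.
grid$fit <- NA_real_
grid$lwr <- NA_real_
grid$upr <- NA_real_
grid$fit[seen] <- plogis(p$fit)
grid$lwr[seen] <- plogis(p$fit - 1.96 * p$se.fit)
grid$upr[seen] <- plogis(p$fit + 1.96 * p$se.fit)
grid
```

Working on the link scale keeps the interval bounds inside (0, 1) after back-transformation, which adding and subtracting standard errors on the response scale does not guarantee.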