All Levels of a Factor in a Model Matrix in R


Question

I have a data.frame consisting of numeric and factor variables, as seen below.

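# stringsAsFactors=TRUE keeps Fourth and Fifth as factors (the default is FALSE since R 4.0.0)
testFrame <- data.frame(First=sample(1:10, 20, replace=TRUE),
           Second=sample(1:20, 20, replace=TRUE), Third=sample(1:10, 20, replace=TRUE),
           Fourth=rep(c("Alice","Bob","Charlie","David"), 5),
           Fifth=rep(c("Edward","Frank","Georgia","Hank","Isaac"),4),
           stringsAsFactors=TRUE)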

I want to build a matrix that assigns dummy variables to the factors and leaves the numeric variables alone.

model.matrix(~ First + Second + Third + Fourth + Fifth, data=testFrame)

As expected when running lm, this leaves out one level of each factor as the reference level. However, I want to build a matrix with a dummy/indicator variable for every level of every factor. I am building this matrix for glmnet, so I am not worried about multicollinearity.
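To make the default behaviour concrete, here is a minimal sketch on a small hypothetical data frame (`toy`, with columns `x` and `g`, is invented for illustration and is not the `testFrame` from the question). With the default treatment contrasts, the first level of the factor gets no column of its own:

```r
# Hypothetical toy data, not the question's testFrame
toy <- data.frame(x = c(1, 2, 3, 4),
                  g = factor(c("a", "b", "c", "a")))

# Default contrasts: one level per factor is absorbed into the intercept
mm <- model.matrix(~ x + g, data = toy)

# Level "a" is the reference level, so there is no "ga" column
colnames(mm)
# "(Intercept)" "x" "gb" "gc"
```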

Is there a way to have model.matrix create a dummy for every level of the factor?

Recommended answer

You need to reset the contrasts for the factor variables:

model.matrix(~ Fourth + Fifth, data=testFrame, 
        contrasts.arg=list(Fourth=contrasts(testFrame$Fourth, contrasts=FALSE), 
                Fifth=contrasts(testFrame$Fifth, contrasts=FALSE)))
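When a data frame has many factor columns, listing each one by hand gets tedious. The sketch below shows one possible way to build the `contrasts.arg` list automatically. The helper name `full_dummy_contrasts` and the `toy` data are hypothetical and only serve the illustration; the approach assumes every column that should be expanded is already a factor.

```r
# Hypothetical helper: full-dummy (identity) contrasts for every factor column
full_dummy_contrasts <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  # contrasts(f, contrasts = FALSE) returns an identity matrix labelled by level
  lapply(df[is_fac], contrasts, contrasts = FALSE)
}

# Hypothetical toy data with one numeric and two factor columns
toy <- data.frame(x = c(1, 2, 3, 4),
                  g = factor(c("a", "b", "c", "a")),
                  h = factor(c("u", "v", "u", "v")))

mm <- model.matrix(~ x + g + h, data = toy,
                   contrasts.arg = full_dummy_contrasts(toy))

# Every level of g and h now has its own indicator column
colnames(mm)
```

The result still contains an `(Intercept)` column. If the downstream fitter adds its own intercept, which I believe glmnet does by default, that column can be dropped with `mm[, -1]`.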

Or, with a little less typing but without the proper column names:

model.matrix(~ Fourth + Fifth, data=testFrame, 
    contrasts.arg=list(Fourth=diag(nlevels(testFrame$Fourth)), 
            Fifth=diag(nlevels(testFrame$Fifth))))
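The names are lost because a bare `diag()` matrix carries no column names. As a sketch of one way to keep the shorter form and still get level-based names, the identity matrix can be labelled with the factor's levels before it is passed in. The helper `named_identity` and the `toy` data are hypothetical.

```r
# Hypothetical helper: identity contrast matrix labelled with the factor's levels
named_identity <- function(f) {
  m <- diag(nlevels(f))
  dimnames(m) <- list(levels(f), levels(f))
  m
}

# Hypothetical toy data
g <- factor(c("a", "b", "c", "a"))
toy <- data.frame(g = g)

# Unlabelled identity matrix: the indicator columns get generic names
plain <- model.matrix(~ g, data = toy,
                      contrasts.arg = list(g = diag(nlevels(g))))

# Labelled identity matrix: the indicator columns are named after the levels
named <- model.matrix(~ g, data = toy,
                      contrasts.arg = list(g = named_identity(g)))

colnames(plain)
colnames(named)
```

Both matrices hold the same values; only the column labels differ.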
