All Levels of a Factor in a Model Matrix in R


Question

I have a data.frame consisting of numeric and factor variables, as shown below.

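# stringsAsFactors=TRUE is required in R >= 4.0, where strings are no longer
# converted to factors by default.
testFrame <- data.frame(First=sample(1:10, 20, replace=TRUE),
           Second=sample(1:20, 20, replace=TRUE), Third=sample(1:10, 20, replace=TRUE),
           Fourth=rep(c("Alice","Bob","Charlie","David"), 5),
           Fifth=rep(c("Edward","Frank","Georgia","Hank","Isaac"),4),
           stringsAsFactors=TRUE)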

I want to build a matrix that assigns dummy variables to the factors and leaves the numeric variables alone.

model.matrix(~ First + Second + Third + Fourth + Fifth, data=testFrame)

As expected when running lm, this leaves out one level of each factor as the reference level. However, I want to build a matrix with a dummy/indicator variable for every level of every factor. I am building this matrix for glmnet, so I am not worried about multicollinearity.

Is there a way to have model.matrix create a dummy variable for every level of each factor?

Recommended answer

You need to reset the contrasts for the factor variables so that each factor uses a full indicator (identity) contrast matrix:

model.matrix(~ Fourth + Fifth, data=testFrame, 
        contrasts.arg=list(Fourth=contrasts(testFrame$Fourth, contrasts=FALSE), 
                Fifth=contrasts(testFrame$Fifth, contrasts=FALSE)))

Or, with a little less typing, but without the level names in the column labels:

model.matrix(~ Fourth + Fifth, data=testFrame, 
    contrasts.arg=list(Fourth=diag(nlevels(testFrame$Fourth)), 
            Fifth=diag(nlevels(testFrame$Fifth))))
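
When there are many factor columns, listing each one by hand gets tedious. The sketch below wraps the first approach in a small helper; `full_contrasts` is a hypothetical name introduced here for illustration and is not part of the original answer or of base R. It assumes the relevant columns are already stored as factors.

```r
# Hypothetical helper (not from the original answer): return a named list of
# full indicator contrast matrices, one per factor column of a data frame.
full_contrasts <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  lapply(df[is_fac], contrasts, contrasts = FALSE)
}

set.seed(1)
testFrame <- data.frame(
  First  = sample(1:10, 20, replace = TRUE),
  Fourth = rep(c("Alice", "Bob", "Charlie", "David"), 5),
  Fifth  = rep(c("Edward", "Frank", "Georgia", "Hank", "Isaac"), 4),
  stringsAsFactors = TRUE  # needed in R >= 4.0 so the strings become factors
)

# Numeric columns pass through unchanged; every factor level gets a column.
mm <- model.matrix(~ First + Fourth + Fifth, data = testFrame,
                   contrasts.arg = full_contrasts(testFrame))
colnames(mm)
```

The resulting matrix still contains an `(Intercept)` column alongside the indicator columns for every level, including the first level of each factor (`FourthAlice`, `FifthEdward`).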
