How do I make a dummy variable in R?
Question
So, my data set consists of 15 variables, one of them (sex) has only 2 levels. I want to use it as a dummy variable, but the levels are 1 and 2. How do I do this? I want to have levels 0 and 1, but I don't know how to manage this in R!
Recommended answer
With most of R's modelling tools that have a formula interface, you don't need to create dummy variables; the underlying code that handles and interprets the formula will do this for you. If you want a dummy variable for some other reason, there are several options. The easiest (IMHO) is to use model.matrix():
set.seed(1)
dat <- data.frame(sex = sample(c("male","female"), 10, replace = TRUE))
model.matrix( ~ sex - 1, data = dat)
which gives:
> dummy <- model.matrix( ~ sex - 1, data = dat)
> dummy
   sexfemale sexmale
1          0       1
2          0       1
3          1       0
4          1       0
5          0       1
6          1       0
7          1       0
8          1       0
9          1       0
10         0       1
attr(,"assign")
[1] 1 1
attr(,"contrasts")
attr(,"contrasts")$sex
[1] "contr.treatment"
> dummy[,1]
 1  2  3  4  5  6  7  8  9 10
 0  0  1  1  0  1  1  1  1  0
You can use either column of dummy as a numeric dummy variable; choose whichever column codes the level you want as 1. dummy[,1] uses 1 to represent the female class and dummy[,2] uses 1 to represent the male class.
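For the situation in the question, where sex is already stored as 1 and 2, plain arithmetic or a logical comparison is another of the available options. This is a minimal sketch, assuming a hypothetical data frame dat2 with a numeric column sex coded 1/2 (the question does not say which code stands for which sex, so the new column names here are illustrative only):

```r
# Hypothetical data: sex coded as 1 and 2, as described in the question
dat2 <- data.frame(sex = c(1, 2, 2, 1, 2))

# Option 1: shift the codes so that 1 becomes 0 and 2 becomes 1
dat2$sex01 <- dat2$sex - 1

# Option 2: logical comparison converted to integer
# (rows where sex == 2 become 1, all other rows become 0)
dat2$is2 <- as.integer(dat2$sex == 2)

dat2
```

If the column is a two-level factor rather than numeric, as.integer(dat2$sex) - 1 should give the same 0/1 recoding, because as.integer() on a factor returns the internal level codes.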
Cast this as a factor if you want it to be interpreted as a categorical object:
> factor(dummy[, 1])
 1  2  3  4  5  6  7  8  9 10
 0  0  1  1  0  1  1  1  1  0
Levels: 0 1
But that defeats the object of a factor; what does 0 mean again?
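To illustrate the point about labels, one can keep descriptive level names and let the formula interface build the 0/1 column internally. This is only a sketch: the data frame d, the response y, and the mapping of 1 to "male" and 2 to "female" are all assumptions made up for the example, not taken from the question.

```r
set.seed(42)

# Hypothetical data: sex coded 1/2 plus a made-up response y
d <- data.frame(sex = c(1, 2, 2, 1, 2, 1, 1, 2))
d$y <- rnorm(nrow(d))

# Keep meaningful labels instead of bare 0/1 codes
# (assumed mapping: 1 = male, 2 = female)
d$sex <- factor(d$sex, levels = c(1, 2), labels = c("male", "female"))

# lm() expands the factor into a treatment-coded dummy on its own;
# the first level ("male") is the baseline
fit <- lm(y ~ sex, data = d)
names(coef(fit))
```

The coefficient is named after the non-baseline level, so the output says which group the dummy refers to, which a bare 0/1 factor cannot do.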