Convert a factor column to multiple boolean columns


Question

Given the following data:

library(data.table)
DT <- data.table(x=rep(1:5, 2))

I can do this:

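# One new column per distinct value of x
new.names <- sort(unique(DT$x))

# Parentheses make data.table evaluate the left-hand side as column names
DT[, (paste0('col', new.names)) := lapply(new.names, function(i) DT$x == i)]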

But this uses a pesky lapply, which is probably slower than the data.table alternative, and this solution strikes me as not very "data.table-ish".

Is there a better and/or faster way to create these new columns?

Recommended answer

Use model.matrix:

model.matrix(~factor(x)-1,data=DT)

   factor(x)1 factor(x)2 factor(x)3 factor(x)4 factor(x)5
1           1          0          0          0          0
2           0          1          0          0          0
3           0          0          1          0          0
4           0          0          0          1          0
5           0          0          0          0          1
6           1          0          0          0          0
7           0          1          0          0          0
8           0          0          1          0          0
9           0          0          0          1          0
10          0          0          0          0          1
attr(,"assign")
[1] 1 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$`factor(x)`
[1] "contr.treatment"
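
Note that the result is a numeric 0/1 matrix rather than a set of logical columns, and that the column names carry a `factor(x)` prefix. Below is a minimal sketch of turning it into boolean columns. It assumes the `col` prefix from the question is the naming you want; `mm` and `bool.mat` are names made up for illustration.

```r
library(data.table)

DT <- data.table(x = rep(1:5, 2))

# Indicator matrix: one column per distinct value of x, no intercept
mm <- model.matrix(~ factor(x) - 1, data = DT)

# Convert the 0/1 entries to TRUE/FALSE; the matrix shape is kept
bool.mat <- mm == 1

# Swap the "factor(x)" prefix for the "col" prefix used in the question
# (the naming scheme is an assumption, not part of the answer)
colnames(bool.mat) <- sub("factor(x)", "col", colnames(bool.mat), fixed = TRUE)

head(bool.mat, 3)
```

Every row of `bool.mat` contains exactly one TRUE, in the column that matches that row's value of x.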

Apparently, you can put model.matrix inside [.data.table to get the same result. I am not sure whether it would be faster:

DT[,model.matrix(~factor(x)-1)]
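
To end up with the new columns inside DT itself, as the question asks, the matrix can be converted to logical and assigned by reference. This is a rough sketch, not a benchmarked solution. It assumes the `col` prefix from the question and builds the matrix with the `data = DT` form shown earlier; `mm` and `new.cols` are illustrative names.

```r
library(data.table)

DT <- data.table(x = rep(1:5, 2))

# 0/1 indicator matrix, columns ordered by the sorted factor levels
mm <- model.matrix(~ factor(x) - 1, data = DT)

# Column names following the question's naming scheme (an assumption)
new.cols <- paste0("col", sort(unique(DT$x)))

# mm == 1 yields a logical matrix; as.data.table() splits it into columns,
# which := then adds to DT by reference
DT[, (new.cols) := as.data.table(mm == 1)]

# Sanity check against the comparison used in the question
stopifnot(identical(DT$col2, DT$x == 2))
```

Whether this beats the lapply version is an open question here. Timing both with system.time() on data of realistic size is the way to find out.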
