Convert a factor column to multiple boolean columns
Question
Given data that looks like this:
library(data.table)
DT <- data.table(x=rep(1:5, 2))
I can create the desired columns like this:
new.names <- sort(unique(DT$x))
DT[, (paste0('col', new.names)) := lapply(new.names, function(i) DT$x==i)]
But this uses a pesky lapply, which is probably slower than the data.table alternative, and this solution strikes me as not very "data.table-ish".
Is there a better and/or faster way to create these new columns?
Recommended answer
What about model.matrix?
model.matrix(~factor(x)-1,data=DT)
factor(x)1 factor(x)2 factor(x)3 factor(x)4 factor(x)5
1 1 0 0 0 0
2 0 1 0 0 0
3 0 0 1 0 0
4 0 0 0 1 0
5 0 0 0 0 1
6 1 0 0 0 0
7 0 1 0 0 0
8 0 0 1 0 0
9 0 0 0 1 0
10 0 0 0 0 1
attr(,"assign")
[1] 1 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$`factor(x)`
[1] "contr.treatment"
Apparently, you can put model.matrix into [.data.table to give the same results. Not sure if it would be faster:
DT[,model.matrix(~factor(x)-1)]
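Note that model.matrix returns a numeric 0/1 matrix rather than boolean columns attached to DT. One possible follow-up is sketched below. It is not part of the original answer: the col1..col5 names simply mirror the question's naming, and converting the comparison result with as.data.table is an assumption of this sketch.

```r
library(data.table)

DT <- data.table(x = rep(1:5, 2))

# Build the 0/1 indicator matrix; "- 1" drops the intercept so that
# every level of factor(x) gets its own column.
mm <- model.matrix(~ factor(x) - 1, data = DT)

# Column names in the style used in the question (col1 ... col5).
# factor() orders its levels the same way as sort(unique(DT$x)).
new.names <- paste0("col", sort(unique(DT$x)))

# mm == 1 turns the numeric matrix into a logical one; as.data.table()
# splits it into columns, which := then adds to DT by reference.
DT[, (new.names) := as.data.table(mm == 1)]
```

Whether this is faster than the lapply version is untested here; timing both with system.time on realistically sized data would settle it.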