Return all factor levels by name as new columns from a three column data.table [R]


Problem description


Is there a way to use data.table or dplyr to solve the problem below?

library(data.table)

(DT = data.table(a = LETTERS[c(1, 1:3, 8)], b = c(2, 4:7), 
                 c = as.factor(c("bob", "mary", "bob", "george", "alice")), key="a"))

Returns:

#    a b      c
# 1: A 2    bob
# 2: A 4   mary
# 3: B 5    bob
# 4: C 6 george
# 5: H 7  alice

Would like to get this:

#    a alice bob george mary
# 1: A    NA   2     NA   NA
# 2: A    NA  NA     NA    4
# 3: B    NA   5     NA   NA
# 4: C    NA  NA      6   NA
# 5: H     7  NA     NA   NA

Solution

This is similar to creating dummy variables.

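# One new column per factor level that actually occurs in c
uc <- sort(unique(as.character(DT$c)))
# Each new column holds b where c matches the level and NA elsewhere; b and c are then dropped
DT[, (uc) := lapply(uc, function(x) ifelse(c == x, b, NA))][, c('b', 'c') := NULL]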
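
Because `:=` modifies `DT` by reference, the line above changes the original table in place. The following is a minimal sketch, not part of the original answer, that applies the same `ifelse` idea to a copy so the input stays intact. The helper name `spread_by_level` is hypothetical and introduced only for illustration.

```r
library(data.table)

# Rebuild the example table from the question
DT <- data.table(a = LETTERS[c(1, 1:3, 8)], b = c(2, 4:7),
                 c = as.factor(c("bob", "mary", "bob", "george", "alice")),
                 key = "a")

# Hypothetical helper: same ifelse idea as above, applied to a copy
spread_by_level <- function(dt) {
  out <- copy(dt)                          # := would otherwise alter dt itself
  uc <- sort(unique(as.character(out$c)))  # one new column per level in use
  out[, (uc) := lapply(uc, function(x) ifelse(c == x, b, NA))]
  out[, c("b", "c") := NULL]               # drop the source columns
  out[]
}

res <- spread_by_level(DT)
print(res)
```

After this call, `DT` still contains `b` and `c`, so other approaches can be tried on the same table.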


I've heard that ifelse can be slow, so a speedier route may be to preallocate the new columns and fill them with set():

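# Assumes DT still has columns b and c (rebuild DT if it was already modified)
uc <- sort(unique(as.character(DT$c)))
is <- seq_len(nrow(DT))        # row indices
js <- as.character(DT$c)       # target column name for each row
vs <- DT$b                     # value to write for each row

# Preallocate the new columns; b is double in this example, so use NA_real_
DT[, (uc) := NA_real_]
# Write each row's value into the column named after its level
for (i in is) set(DT, i = i, j = js[i], value = vs[i])

DT[, c('b', 'c') := NULL]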
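
As an alternative that is not part of the original answer, the same long-to-wide reshape can probably be expressed with data.table's `dcast`. This is a sketch under the assumption that every input row should stay a separate output row, which is why a helper row-id column (named `rid` here, a name introduced only for this example) is added before casting.

```r
library(data.table)

# Rebuild the example table from the question
DT <- data.table(a = LETTERS[c(1, 1:3, 8)], b = c(2, 4:7),
                 c = as.factor(c("bob", "mary", "bob", "george", "alice")),
                 key = "a")

# 'rid' is a helper column (not in the original) that keeps the two
# rows with a == "A" from being merged into one output row
DT[, rid := .I]

# One output column per level of c, filled with b; unmatched cells become NA
wide <- dcast(DT, rid + a ~ c, value.var = "b")
wide[, rid := NULL]   # the helper column is no longer needed
print(wide)
```

Without the row id, the two rows sharing `a == "A"` would collapse into a single row, which differs from the output requested in the question.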
