Recode a variable using data.table
Question
I am trying to recode a variable using data.table. I have googled for almost 2 hours but couldn't find an answer.
Assume I have the following data.table:
library(data.table)
DT <- data.table(V1=c(0L,1L,2L),
                 V2=LETTERS[1:3],
                 V4=1:12)
I want to recode V1 and V2. For V1, I want to recode 1s to 0 and 2s to 1. For V2, I want to recode A to T, B to K, and C to D.
If I use dplyr, it is simple.
library(dplyr)
DT %>%
mutate(V1 = recode(V1, `1` = 0L, `2` = 1L)) %>%
mutate(V2 = recode(V2, A = "T", B = "K", C = "D"))
But I have no idea how to do this in data.table.
DT[V1==1, V1 := 0]
DT[V1==2, V1 := 1]
DT[V2=="A", V2 := "T"]
DT[V2=="B", V2 := "K"]
DT[V2=="C", V2 := "D"]
The code above is the best I could come up with, but there must be a better and more efficient way to do this.
Edit
I changed how I want to recode V2 to make my example more general.
Recommended answer
I think this might be what you're looking for. On the left-hand side of := we name the variables we want to update, and on the right-hand side we give the expressions whose values replace the corresponding variables.
DT[, c("V1","V2") := .(as.numeric(V1==2),
                       sapply(V2, function(x) {
                         if (x=="A") "T"
                         else if (x=="B") "K"
                         else if (x=="C") "D"
                       }))]
# V1 V2 V4
#1: 0 T 1
#2: 0 K 2
#3: 1 D 3
#4: 0 T 4
#5: 0 K 5
#6: 1 D 6
#7: 0 T 7
#8: 0 K 8
#9: 1 D 9
#10: 0 T 10
#11: 0 K 11
#12: 1 D 12
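The element-by-element if/else chain above can also be written as a vectorized lookup. The following is a minimal sketch, not part of the original answer: the names v1_map and v2_map are hypothetical, and it assumes every value in the column has an entry in its lookup vector (a value without one would become NA).

```r
library(data.table)

DT <- data.table(V1 = c(0L, 1L, 2L),
                 V2 = LETTERS[1:3],
                 V4 = 1:12)

# Hypothetical lookup vectors: names are the old values, elements the new ones.
v1_map <- c("0" = 0L, "1" = 0L, "2" = 1L)
v2_map <- c(A = "T", B = "K", C = "D")

# Index each lookup vector by the old values (as character) and drop the
# names so the columns end up as plain vectors. Both columns are updated
# by reference in a single := call.
DT[, c("V1", "V2") := .(unname(v1_map[as.character(V1)]),
                        unname(v2_map[V2]))]
```

Unlike as.numeric(V1==2), this sketch keeps V1 as an integer column, and adding a new code only means adding one entry to the lookup vector.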
Alternatively, just use dplyr's recode within data.table:
library(dplyr)
DT[, c("V1","V2") := .(as.numeric(V1==2), recode(V2, "A" = "T", "B" = "K", "C" = "D"))]
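If you would rather not load dplyr, data.table ships its own vectorized helpers. The following is a sketch under the assumption that your installed data.table is recent enough to provide fifelse and fcase (fcase was added in version 1.13.0); it is not part of the original answer.

```r
library(data.table)

DT <- data.table(V1 = c(0L, 1L, 2L),
                 V2 = LETTERS[1:3],
                 V4 = 1:12)

# fifelse: 2 becomes 1, everything else (0 and 1) becomes 0.
# fcase: the first matching condition wins; values that match no
# condition become NA because no default is supplied.
DT[, `:=`(
  V1 = fifelse(V1 == 2L, 1L, 0L),
  V2 = fcase(V2 == "A", "T",
             V2 == "B", "K",
             V2 == "C", "D")
)]
```

Both helpers require the replacement values to share one type, which is why the integer literals 1L and 0L are used for V1.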