Recode a variable using data.table


Question

I am trying to recode a variable using data.table. I have searched for almost two hours but couldn't find an answer.

Assume I have a data.table like the following:

library(data.table)

# V1 and V2 (length 3) are recycled to match V4 (length 12)
DT <- data.table(V1=c(0L,1L,2L),
                 V2=LETTERS[1:3],
                 V4=1:12)

I want to recode V1 and V2. For V1, I want to recode 1s to 0 and 2s to 1. For V2, I want to recode A to T, B to K, and C to D.

If I use dplyr, it is simple.

library(dplyr)
DT %>% 
  mutate(V1 = recode(V1, `1` = 0L, `2` = 1L)) %>% 
  mutate(V2 = recode(V2, A = "T", B = "K", C = "D"))

But I have no idea how to do this in data.table:

# Use integer literals (0L, 1L) so the integer column V1 is not assigned a double
DT[V1==1L, V1 := 0L]
DT[V1==2L, V1 := 1L]
DT[V2=="A", V2 := "T"]
DT[V2=="B", V2 := "K"]
DT[V2=="C", V2 := "D"]

The code above is the best I could come up with, but there must be a better and more efficient way to do this.

Edit

I changed how I want to recode V2 to make my example more general.

Recommended answer

I think this might be what you're looking for. On the left-hand side of := we name the variables we want to update, and on the right-hand side we supply, in the same order, the expressions whose results replace them.

DT[, c("V1", "V2") := .(
  as.numeric(V1 == 2),
  sapply(V2, function(x) {
    if (x == "A") "T"
    else if (x == "B") "K"
    else if (x == "C") "D"
  })
)]

 #   V1 V2 V4
 #1:  0  T  1
 #2:  0  K  2
 #3:  1  D  3
 #4:  0  T  4
 #5:  0  K  5
 #6:  1  D  6
 #7:  0  T  7
 #8:  0  K  8
 #9:  1  D  9
#10:  0  T 10
#11:  0  K 11
#12:  1  D 12
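
When the mapping grows beyond a few values, another option is to keep the old/new pairs in small lookup tables and apply them with an update join. The following is only a sketch under assumptions: the names `v1_map`, `v2_map`, `old`, and `new` are hypothetical and not part of the original answer, and the recycling in the question's `DT` is written out explicitly with `rep()`.

```r
library(data.table)

# Same data as in the question, with the recycling spelled out
DT <- data.table(V1 = rep(c(0L, 1L, 2L), 4),
                 V2 = rep(LETTERS[1:3], 4),
                 V4 = 1:12)

# Hypothetical lookup tables: one row per old/new pair
v1_map <- data.table(old = c(1L, 2L), new = c(0L, 1L))
v2_map <- data.table(old = c("A", "B", "C"), new = c("T", "K", "D"))

# Update join: rows of DT whose value matches `old` receive `new`;
# rows with no match in the lookup table are left unchanged
DT[v1_map, on = .(V1 = old), V1 := i.new]
DT[v2_map, on = .(V2 = old), V2 := i.new]
```

Because the matches are looked up before anything is assigned, recoding 1 to 0 should not interfere with recoding 2 to 1 in the same call. V1 also stays an integer column here, whereas `as.numeric(V1 == 2)` converts it to double.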

Alternatively, just use recode within data.table:

library(dplyr)
DT[, c("V1","V2") := .(as.numeric(V1==2), recode(V2, "A" = "T", "B" = "K", "C" = "D"))]
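
If you would rather avoid the dplyr dependency, data.table's own `fcase()` can play a similar role. This is a minimal sketch, assuming data.table 1.13.0 or later (the release that introduced `fcase()`); it is not part of the original answer, and the recycling in the question's `DT` is written out explicitly with `rep()`.

```r
library(data.table)

# Same data as in the question, with the recycling spelled out
DT <- data.table(V1 = rep(c(0L, 1L, 2L), 4),
                 V2 = rep(LETTERS[1:3], 4),
                 V4 = 1:12)

# fcase() takes condition/value pairs; the first TRUE condition wins.
# Values not covered by any condition become NA (no `default` is given).
DT[, c("V1", "V2") := .(
  fcase(V1 == 2L, 1L,
        V1 %in% c(0L, 1L), 0L),
  fcase(V2 == "A", "T",
        V2 == "B", "K",
        V2 == "C", "D")
)]
```

All replacement values within one `fcase()` call need to share a type, which is why the integer literals `0L` and `1L` are used for V1.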
