Recode a variable using data.table package
Question
If I want to recode a variable in R using data.table, what is the syntax? I saw some answers but didn't find them appropriate.
For example, if I have a variable called gender and I want to recode 0 to unknown, 1 to male, and 2 to female, here is how I tried:
Name <- c("John", "Tina", "Dave", "Casper")
Gender <- c(1, 2, 2, 0)
trips <- cbind.data.frame(Name, Gender)
trips[, gender = ifelse(gender == 0, "Unkown", gender == 1, "Male", gender == 2, "Female" )]
But I get an error message.
Recommended answer
Once you have a data.table, it would be most efficient to use a vectorized translation strategy. The match function provides a way to create a "selection vector" for choosing an item from a set of character possibilities:
library(data.table)
setDT(trips)  # convert the data.frame to a data.table by reference
trips[ , Gender := c("Unknown", "Male", "Female")[match(Gender, c(0,1,2))] ]
#-------------------
> trips
     Name  Gender
1:   John    Male
2:   Tina  Female
3:   Dave  Female
4: Casper Unknown
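To see what the selection vector looks like, here is a minimal base R sketch of the same lookup outside of data.table. The names codes, labels and idx are made up for this illustration and are not part of the original answer:

```r
# Hypothetical helper names (codes, labels, idx), used only for illustration
Gender <- c(1, 2, 2, 0)
codes  <- c(0, 1, 2)
labels <- c("Unknown", "Male", "Female")

# match() gives, for each Gender value, its position in codes
idx <- match(Gender, codes)      # positions 2, 3, 3, 1

# Indexing labels by those positions performs the recode
recoded <- labels[idx]           # "Male", "Female", "Female", "Unknown"

# A value that is not listed in codes becomes NA instead of raising an error
unmatched <- labels[match(c(1, 9), codes)]   # "Male", NA
```

Because unmatched codes silently become NA, it may be worth checking the result with anyNA() if the data could contain unexpected values.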
For this specific case, a simpler solution could be (hat tip to @Chinsoon):
# assumes Gender still holds the numeric codes 0, 1, 2 (i.e. it has not been recoded yet)
trips[, Gender := c("Unknown", "Male", "Female")[Gender + 1L] ]
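The shortcut works because the codes 0, 1, 2 line up with positions 1, 2, 3 of the label vector once 1 is added. A small base R sketch of that equivalence follows; the names labels, by_index, by_match and out_of_range are invented here for illustration:

```r
# Hypothetical names (labels, by_index, by_match, out_of_range) for illustration only
Gender <- c(1, 2, 2, 0)
labels <- c("Unknown", "Male", "Female")

# Shortcut: shift the 0-based codes to 1-based positions
by_index <- labels[Gender + 1L]

# General approach: look each code up explicitly
by_match <- labels[match(Gender, c(0, 1, 2))]

# Both give the same result for contiguous codes starting at 0
same <- identical(by_index, by_match)

# A code outside 0..2 indexes past the end of labels and yields NA
out_of_range <- labels[9 + 1L]
```

If the codes are not contiguous or do not start at 0 (say 1, 2 and 9), the shortcut no longer applies and the match version is the safer choice.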