data.table: Proper way to create a conditional variable when column names are not known?
Question
My question relates to creating a variable that depends on other columns within a data.table when none of the variable names are known in advance.
Below is a toy example with 5 rows, where the new variable should be 1 when the condition equals A and 4 otherwise.
library(data.table)
DT <- data.table(Con = c("A","A","B","A","B"),
                 Eval_A = rep(1,5),
                 Eval_B = rep(4,5))
Col1 <- "Con"
Col2 <- "Eval_A"
Col3 <- "Eval_B"
Col4 <- "Ans"
The code below works, but it feels like I'm misusing the package!
DT[,Col4:=ifelse(DT[[Col1]]=="A",
                 DT[[Col2]],
                 DT[[Col3]]),with=FALSE]
Update: Thanks, I did some quick timing of the answers below: once on a data.table with 5 million rows and only the relevant columns, and again after adding 10 irrelevant columns. The results are below:
+-------------------------+---------------------+------------------+
| Method | Only relevant cols. | With extra cols. |
+-------------------------+---------------------+------------------+
| List method | 1.8 | 1.91 |
| Grothendieck - get/if | 26.79 | 30.04 |
| Grothendieck - get/join | 0.48 | 1.56 |
| Grothendieck - .SDcols | 0.38 | 0.79 |
| agstudy - Substitute | 2.03 | 1.9 |
+-------------------------+---------------------+------------------+
It looks like .SDcols is best for speed, and substitute is best for easy-to-read code.
Recommended answer
1. get/if. Try using get:
DT[, (Col4) := if (get(Col1) == "A") get(Col2) else get(Col3), by = 1:nrow(DT)]
2. get/join. Or try this approach:
setkeyv(DT, Col1)
DT[, (Col4):=get(Col3)]["A", (Col4):=get(Col2)]
3. .SDcols. Or this:
setkeyv(DT, Col1)
DT[, (Col4):=.SD, .SDcols = Col3]["A", (Col4):=.SD, .SDcols = Col2]
UPDATE: Added some additional approaches.
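For comparison, here is a minimal sketch of two further variants on the same toy data. Neither is taken from the answers above. The first assumes a data.table version that provides fifelse() (1.12.4 or later) and evaluates the whole column at once instead of going row by row. The second is only a guess at what a substitute-based approach, like the one timed in the question, might look like. The names DT1, DT2, expr, lhs, cond, a and b are hypothetical.

```r
library(data.table)

DT <- data.table(Con    = c("A", "A", "B", "A", "B"),
                 Eval_A = rep(1, 5),
                 Eval_B = rep(4, 5))
Col1 <- "Con"
Col2 <- "Eval_A"
Col3 <- "Eval_B"
Col4 <- "Ans"

# Variant A (assumption: fifelse() is available, data.table >= 1.12.4).
# get() looks up each column by the name stored in the variable, and
# (Col4) tells := to use the value of Col4 as the new column name.
DT1 <- copy(DT)
DT1[, (Col4) := fifelse(get(Col1) == "A", get(Col2), get(Col3))]

# Variant B (hypothetical sketch of a substitute-based approach):
# build the call `Ans := ifelse(Con == "A", Eval_A, Eval_B)` from the
# name strings, then evaluate it inside the data.table.
expr <- substitute(
  lhs := ifelse(cond == "A", a, b),
  list(lhs  = as.name(Col4),
       cond = as.name(Col1),
       a    = as.name(Col2),
       b    = as.name(Col3))
)
DT2 <- copy(DT)
DT2[, eval(expr)]

# Both variants should produce the same new column.
stopifnot(identical(DT1[[Col4]], DT2[[Col4]]))
```

Both variants leave the row order untouched, unlike the approaches that call setkeyv(), which sort the table by the key column.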