寻求从查找表向data.table中添加列的更好方法 [英] Seeking an better way to add columns in data.table from lookup table

查看:45
本文介绍了寻求从查找表向data.table中添加列的更好方法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想通过与常见列 index1 &的查找表合并,在data.table中创建一个新列 key . index2 .然后,从这个新的 key 列( a b c )的值中,生成3个新列( a b c ),它们为data.table中的 value 列编制索引.

I'd like to create a new column key in my data.table by merging with a lookup table by common columns index1 & index2. Then from the values of this new key column (a,b,c), I'd like to generate 3 new columns (a,b,c) that indices the value column in the data.table.

我的data.table看起来像这样:

My data.table looks like this:

    index1      index2    value
1       2          0     0.00
2       1          2    -5.00
3       3          2    -5.00
4       3          2    17.50
5       2          2    15.00
6       1          2    -7.50
7       3          2     3.75
8       1          2    -8.75
9       2          1    15.00
10      2          1    12.50

查找表是这样的:

 index1  index2  key
1    1        1   a
2    1        2   b
3    2        1   a
4    2        2   c
5    3        1   c
6    3        2   b

最终结果是这样的:

    index1     index2   value  key       a       b      c
1       2          0     0.00   NA      NA      NA     NA
2       1          2    -5.00    b      NA   -5.00     NA
3       3          2    -5.00    b      NA   -5.00     NA
4       3          2    17.50    b      NA   17.50     NA
5       2          2    15.00    c      NA      NA  15.00
6       1          2    -7.50    b      NA   -7.50     NA
7       3          2     3.75    b      NA    3.75     NA
8       1          2    -8.75    b      NA   -8.75     NA 
9       2          1    15.00    a   15.00      NA     NA
10      2          1    12.50    a   12.50      NA     NA

我尝试通过先通过 merge()合并data.table和lookup table来解决问题,然后分别使用 J()三次,以实现上述结果.我对data.table还是很陌生,但是我希望学习一种更优雅的方法来解决此问题,而不必重复执行几次该过程.这是我的代码:

I tried solving it by first merging the data.table and the lookup table by merge(), then used J() three separate times to achieve the above result. I'm pretty new to data.table, but would love learn a more elegant way to solve this instead of repeating the procedure several times. Here's my code:

DT <- merge(DT, lookup, by=c('index1', 'index2'), all.x=TRUE)
DT <- data.table(DT)  #Don't know why but DT became a data.frame after merge()
DT[J("a"), a:=value]
DT[J("b"), b:=value]
DT[J("c"), c:=value]

推荐答案

由于您要加入index1&index2,您可以按需使用 merge ,或者如果您将这些分配为每个表的键,则可以简单地使用 [] 进行联接.(与 DT [lookup] 一样)

Since you want to join on index1 & index2, you can use merge as you have or if you assign these as keys to each of your tables, then you can simply use [] to join. (As in DT[lookup] )

 setkey(lookup, index1, index2)
 setkey(DT, index1, index2)

然后,您可以按以下步骤遍历 lookup 表中的每个唯一 key

then you can iterate over each unique key from your lookup table as follows

 keyVals <- unique(lookup[, key])

 for (k in keyVals)
   DT[lookup[key==k], c(k) := value]


结果:

DT
#     index1 index2 value    a     b  c
#  1:      1      2 -5.00   NA -5.00 NA
#  2:      1      2 -7.50   NA -7.50 NA
#  3:      1      2 -8.75   NA -8.75 NA
#  4:      2      0  0.00   NA    NA NA
#  5:      2      1 15.00 15.0    NA NA
#  6:      2      1 12.50 12.5    NA NA
#  7:      2      2 15.00   NA    NA 15
#  8:      3      2 -5.00   NA -5.00 NA
#  9:      3      2 17.50   NA 17.50 NA
# 10:      3      2  3.75   NA  3.75 NA

这篇关于寻求从查找表向data.table中添加列的更好方法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆