Assignment with data.table join operation, multiple matches
Question
I have two data.tables:
library(data.table)
dt1 = data.table(a=c('a','b'))
dt2 = data.table(a=c('a','b','b'))
The join dt1[dt2, on = 'a'] gives
   a
1: a
2: b
3: b
So when I perform the operation dt1[dt2, on = 'a', c := 1]
I expect
   a c
1: a 1
2: b 1
3: b 1
but I get
   a c
1: a 1
2: b 1
Why?
Recommended answer
We need to assign into dt2 instead, swapping the roles of the two tables:
dt2[dt1, c := 1, on = "a"]
dt2
#    a c
#1: a 1
#2: b 1
#3: b 1
If we do not want to change the initial dataset dt1, we can build a new table instead:
dt1[dt2, c(.SD, c = 1), on = 'a']
#    a c
#1: a 1
#2: b 1
#3: b 1
The problem with the OP's approach is that the assignment (:=) updates the first dataset (dt1) by reference. dt1 has only 2 rows, so the assigned values also end up in those 2 rows rather than in the 3 rows of the join result. One option is to assign into the second dataset (as shown in the first method); the other is to create a new dataset by appending a new column c to the join result.
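To make the row counts concrete, here is a minimal sketch that reuses the two tables from the question and checks nrow() at each step. It is only an illustration of the explanation above, not a different method, and it assumes nothing beyond the data.table package itself.

```r
library(data.table)

dt1 <- data.table(a = c("a", "b"))
dt2 <- data.table(a = c("a", "b", "b"))

# The join result has one row per row of dt2 (the table passed as i) ...
n_join <- nrow(dt1[dt2, on = "a"])

# ... but := updates dt1 (the table in front of the bracket) by reference,
# so no rows are added to it.
dt1[dt2, on = "a", c := 1]
n_dt1 <- nrow(dt1)

# Assigning into dt2 instead fills the new column on all of its rows.
dt2[dt1, on = "a", c := 1]
n_dt2 <- nrow(dt2)

print(c(join = n_join, dt1 = n_dt1, dt2 = n_dt2))
```

In short, := can only fill in columns on rows that already exist in the table being updated; it never grows that table to match the join result.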