Efficient left_join and subsequent merge
Problem description
I have the following data:
library(dplyr)
a <- data.frame("one" = c(1:10),
                "two" = c("", "", "", "", "a", "a", "a", "a", "a", "a"), stringsAsFactors = FALSE)
b <- data.frame("one" = c(4, 2, 6),
                "two" = c("C", "D", "A"), stringsAsFactors = FALSE)
I want to left_join b onto a, such that a$two gets the value of b$two whenever a$one == b$one. I do it like this:
left_join(a, b, by="one")
To get back the same structure as before (a single column two), we can do the following:
left_join(a, b, by="one") %>%
mutate(two=ifelse(is.na(two.y), two.x, two.y)) %>%
select(-c(two.x, two.y))
This gives me the desired output:
one two
1 1
2 2 D
3 3
4 4 C
5 5 a
6 6 A
7 7 a
8 8 a
9 9 a
10 10 a
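As a side note, the ifelse(is.na(...)) step can be written more compactly with dplyr's coalesce(), which returns the first non-missing value element by element. This is only a minimal sketch on the same a and b as above; it still needs mutate and select, so it shortens the pipeline rather than removing it. The name res is introduced here purely for illustration.

library(dplyr)

# Same example data as in the question
a <- data.frame(one = 1:10,
                two = c(rep("", 4), rep("a", 6)),
                stringsAsFactors = FALSE)
b <- data.frame(one = c(4, 2, 6),
                two = c("C", "D", "A"),
                stringsAsFactors = FALSE)

res <- left_join(a, b, by = "one") %>%
  # take b's value (two.y) where a match exists, otherwise keep a's value (two.x)
  mutate(two = coalesce(two.y, two.x)) %>%
  select(-two.x, -two.y)

res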
Is there a way to perform the left_join such that it isn't necessary to mutate and select to obtain the desired output? I.e., is there a more efficient way to get what I want? Right now it seems cumbersome to both mutate and select.
Recommended answer
If we are looking for a compact and efficient option, this can be achieved with data.table. After converting 'a' to a data.table with setDT, join on 'one' and assign (:=) 'i.two', i.e. the column 'two' from 'b', to 'two' (from 'a'). Only the rows of 'a' that have a match in 'b' are updated, and the update happens by reference, so no new columns are created and nothing has to be dropped afterwards.
library(data.table)
setDT(a)[b, two := i.two, on = .(one)]
a
# one two
# 1: 1
# 2: 2 D
# 3: 3
# 4: 4 C
# 5: 5 a
# 6: 6 A
# 7: 7 a
# 8: 8 a
# 9: 9 a
#10: 10 a
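Note that setDT(a) converts 'a' in place and := then modifies it by reference, so the original data.frame is changed. If the original should be kept untouched, one possible variation is to run the same update join on a copy. This is a minimal sketch under that assumption; the name a_dt is introduced here for illustration only.

library(data.table)

# Same example data as in the question
a <- data.frame(one = 1:10,
                two = c(rep("", 4), rep("a", 6)),
                stringsAsFactors = FALSE)
b <- data.frame(one = c(4, 2, 6),
                two = c("C", "D", "A"),
                stringsAsFactors = FALSE)

# copy() makes a deep copy, so setDT() and := only touch a_dt, not a
a_dt <- setDT(copy(a))

# Update join: for rows of a_dt matching b on 'one', overwrite 'two' with b's 'two'
a_dt[b, two := i.two, on = "one"]

a_dt

Here 'a' remains a plain data.frame with its original values, while a_dt holds the updated column.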