Efficient left_join and subsequent merge


Question

I have the following data:

library(dplyr)

a<-data.frame("one"=c(1:10),
              "two"=c("","","","","a","a","a","a","a","a"), stringsAsFactors = F)

b<-data.frame("one"=c(4,2,6),
              "two"=c("C","D","A"), stringsAsFactors = F)

I want to left_join b onto a, so that a$two takes the value of b$two wherever a$one == b$one. I do it like this:

left_join(a, b, by="one")

The join produces two columns, two.x and two.y. To get back the same structure as before, we can do the following:

left_join(a, b, by="one") %>% 
  mutate(two=ifelse(is.na(two.y), two.x, two.y)) %>% 
  select(-c(two.x, two.y))

This gives me the desired output:

   one two
1    1    
2    2   D
3    3    
4    4   C
5    5   a
6    6   A
7    7   a
8    8   a
9    9   a
10  10   a
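
As a side note, the mutate and select steps above can be folded into a single mutate call. The following is a minimal sketch, assuming dplyr 1.0.0 or later (for the `.keep` argument); `res` is a name introduced here for illustration only. `coalesce()` takes the first non-missing value, so it picks two.y where b had a match and falls back to two.x otherwise.

```r
library(dplyr)

# Same data as in the question
a <- data.frame(one = 1:10,
                two = c("", "", "", "", "a", "a", "a", "a", "a", "a"),
                stringsAsFactors = FALSE)
b <- data.frame(one = c(4, 2, 6),
                two = c("C", "D", "A"),
                stringsAsFactors = FALSE)

# 'res' is an illustrative name, not part of the original question.
# .keep = "unused" drops the columns consumed by the expression
# (two.x and two.y), leaving only 'one' and the new 'two'.
res <- left_join(a, b, by = "one") %>%
  mutate(two = coalesce(two.y, two.x), .keep = "unused")

res
```

This still needs a mutate, so it shortens the pipeline rather than removing the extra step.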

Is there a way to perform the left_join so that the mutate and select steps are not needed to obtain the desired output? I.e., is there a more efficient way to get what I want? Right now it seems cumbersome to have to both mutate and select.

Recommended answer

If we are looking for a compact and efficient option, data.table can do this with an update join. After converting 'a' to a data.table with setDT(), join on 'one' and assign (:=) 'i.two', i.e. the 'two' column from 'b', to 'two' in 'a'. The assignment happens by reference, so 'a' is modified in place and no separate mutate or select step is needed.

library(data.table)
setDT(a)[b,two := i.two , on = .(one)]
a
#     one two
# 1:   1    
# 2:   2   D
# 3:   3    
# 4:   4   C
# 5:   5   a
# 6:   6   A
# 7:   7   a
# 8:   8   a
# 9:   9   a
#10:  10   a
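
If staying within dplyr is preferred, a similar update can be sketched with rows_update(). This is not part of the original answer; it assumes dplyr 1.0.0 or later, that the key values in 'b' are unique, and that every key in 'b' also exists in 'a' (by default rows_update() raises an error otherwise). The name `updated` is introduced here for illustration.

```r
library(dplyr)

# Same data as in the question
a <- data.frame(one = 1:10,
                two = c("", "", "", "", "a", "a", "a", "a", "a", "a"),
                stringsAsFactors = FALSE)
b <- data.frame(one = c(4, 2, 6),
                two = c("C", "D", "A"),
                stringsAsFactors = FALSE)

# Overwrite the values of 'two' in 'a' with those from 'b'
# for rows whose 'one' key matches; all other rows are left unchanged.
# 'updated' is an illustrative name, not from the original answer.
updated <- rows_update(a, b, by = "one")

updated
```

Unlike the data.table approach, this returns a new data frame and leaves 'a' itself untouched.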
