left_join two data frames and overwrite


Problem description

I'd like to merge two data frames so that values from df2 overwrite the corresponding values in df1, whether those are NA or already present. The question "Merge data frames and overwrite values" provides a data.table option, but I'd like to know whether there is a way to do this with dplyr. I've tried all of the _join options, but none seem to do this. Is there a way to do this with dplyr?

Here is an example:

df1 <- data.frame(y = c("A", "B", "C", "D"), x1 = c(1,2,NA, 4)) 
df2 <- data.frame(y = c("A", "B", "C"), x1 = c(5, 6, 7))

Desired output:

  y x1
1 A  5
2 B  6
3 C  7
4 D  4

Solution

I think what you want is to keep the values of df2 and add only the rows of df1 that are not present in df2, which is what anti_join does:

"anti_join return all rows from x where there are not matching values in y, keeping just columns from x."
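To make the quoted behaviour concrete, here is a small sketch using the example data from the question. The `stringsAsFactors = FALSE` argument and the name `only_in_df1` are additions for illustration and are not part of the original post.

```r
library(dplyr)

# Example data from the question; stringsAsFactors = FALSE is an
# assumption added here so that y is stored as character, not factor.
df1 <- data.frame(y = c("A", "B", "C", "D"), x1 = c(1, 2, NA, 4),
                  stringsAsFactors = FALSE)
df2 <- data.frame(y = c("A", "B", "C"), x1 = c(5, 6, 7),
                  stringsAsFactors = FALSE)

# Keep only the rows of df1 whose key y has no match in df2.
only_in_df1 <- anti_join(df1, df2, by = "y")
only_in_df1
```

With this data, only the row with y equal to "D" should remain, since A, B and C all appear in df2.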

My solution:

library(dplyr)

df3 <- anti_join(df1, df2, by = "y") %>% bind_rows(df2)

Warning messages:
1: In anti_join_impl(x, y, by$x, by$y) :
  joining factors with different levels, coercing to character vector
2: In rbind_all(x, .id) : Unequal factor levels: coercing to character

> df3
Source: local data frame [4 x 2]

      y    x1
  (chr) (dbl)
1     D     4
2     A     5
3     B     6
4     C     7

This line gives the desired output (in a different row order). Pay attention to the warning messages, though: they appear because y is a factor with different levels in df1 and df2. When working with your own dataset, be sure to read y as a character variable.
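One way to act on that advice, sketched under the assumption that the data frames are built with `stringsAsFactors = FALSE` (so y is character from the start) and that sorting by y is an acceptable way to restore the row order. The `arrange()` step is an addition for illustration, not part of the original answer.

```r
library(dplyr)

# y is created as character, so the factor-level coercion that
# triggered the warnings above does not occur.
df1 <- data.frame(y = c("A", "B", "C", "D"), x1 = c(1, 2, NA, 4),
                  stringsAsFactors = FALSE)
df2 <- data.frame(y = c("A", "B", "C"), x1 = c(5, 6, 7),
                  stringsAsFactors = FALSE)

df3 <- anti_join(df1, df2, by = "y") %>%  # rows found only in df1
  bind_rows(df2) %>%                      # plus every row of df2
  arrange(y)                              # sort by key to restore order
df3
```

If the data come from a file instead, the same effect can usually be had by reading the key column as character when importing, rather than converting it afterwards.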
