如何使用多列作为键合并两个数据框? [英] How to merge two dataframes using multiple columns as key?

查看:84
本文介绍了如何使用多列作为键合并两个数据框?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

说我有以下数据帧:

DF1 <- data.frame("A" = rep(c("A","B"), 18),
                  "B" = rep(c("C","D","E"), 12),
                  "NUM"= rep(rnorm(36,10,1)),
                  "TEST" = rep(NA,36))

DF2 <- data.frame("A" = rep("A",6),
                  "B" = rep(c("C","D"),6),
                  "VAL" = rep(c(1,3),3))

*注:变量 A DF2 中的 B 应该具有唯一的 VAL

*Note: Each unique combination of variables A and B in DF2 should have a unique VAL.

对于每一行,我想在 TEST NA c $ c>,如果 A列中的值为 DF1 中的 VAL A 匹配,并且列 B B中的值匹配匹配该行。否则,我会将 TEST 保留为 NA

For each row, I would like to replace the NA in TEST with the corresponding value of VAL in DF1 if the values in columns A and A match and the values in columns B and B match for that row. Otherwise, I'd leave TEST as NA. How would I do this without looping through each combination using match?

理想情况下,答案将缩放到两个数据帧,其中有许多列要匹配。

Ideally, an answer would scale to two data frames with many columns to match upon.

推荐答案

正如Akrun在评论中提到的那样,您的查找表(DF2)需要简化为唯一的A / B组合。对于您当前的数据框,这不是问题,但是如果同一组合有多个可能的值,则需要其他规则。从那里开始,解决方案很容易:

As Akrun mentioned in comments, your lookup table (DF2) needs to be reduced to just its unique A/B combinations. For your current dataframe, this isn't a problem, but you will need additional rules if there are multiple possible values for the same combination. From there, the solution is easy:

DF2.u <- unique(DF2)
DF3 <- merge(DF1, DF2.u, all = T)

请注意,这将产生一个新的数据帧空的TEST列(所有值 NA ),以及从DF2分配的VAL列。要完全执行您想要的操作(在可能的情况下用VAL代替TEST),下面是一些较笨拙的代码:

Note that this will produce a new dataframe with an empty TEST column (all values NA), and a VAL column assigned from DF2. To do exactly what you wanted (replace TEST with VAL where possible), here is some slightly clunkier code:

DF1$TEST <- merge(DF1, DF2.u, all = T)$VAL

编辑:响应您的问题,可以根据需要将DF2简化:

in response to your question, you can boil down DF2 if necessary quite simple:

DF2$C <- c(1:12) #now unique() won't work
DF2.u <- unique(DF2[1:3])

 A B VAL
1 A C   1
2 A D   3

这篇关于如何使用多列作为键合并两个数据框?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆