How can I merge multiple dataframes with the same column names?


Question

I have a "master" dataframe that has the following columns:

userid, condition

Since there are four experiment conditions, I also have four dataframes that carry answer information, with the following columns:

userid, condition, answer1, answer2

Now I'd like to join these, so that all combinations of user IDs, conditions, and the answers given under those conditions are merged. Each row should contain only the answers for its own condition, in the appropriate columns.

master = data.frame(userid=c("foo","foo","foo","foo","bar","bar","bar","bar"), condition=c("A","B","C","D","A","B","C","D"))
cond_a = data.frame(userid=c("foo","bar"), condition="A", answer1=c("1","1"), answer2=c("2","2"))
cond_b = data.frame(userid=c("foo","bar"), condition="B", answer1=c("3","3"), answer2=c("4","4"))
cond_c = data.frame(userid=c("foo","bar"), condition="C", answer1=c("5","5"), answer2=c("6","6"))
cond_d = data.frame(userid=c("foo","bar"), condition="D", answer1=c("7","7"), answer2=c("8","8"))

How do I merge all conditions into the master, so that the master table looks as follows?

  userid condition answer1 answer2
1    bar         A       1       2
2    bar         B       3       4
3    bar         C       5       6
4    bar         D       7       8
5    foo         A       1       2
6    foo         B       3       4
7    foo         C       5       6
8    foo         D       7       8

I tried the following:

temp = merge(master, cond_a, all.x=TRUE)

Which gives me:

  userid condition answer1 answer2
1    bar         A       1       2
2    bar         B    <NA>    <NA>
3    bar         C    <NA>    <NA>
4    bar         D    <NA>    <NA>
5    foo         A       1       2
6    foo         B    <NA>    <NA>
7    foo         C    <NA>    <NA>
8    foo         D    <NA>    <NA>

But as soon as I do this…

merge(temp, cond_b, all.x=TRUE)

There are no values for condition B. How come?

  userid condition answer1 answer2
1    bar         A       1       2
2    bar         B    <NA>    <NA>
3    bar         C    <NA>    <NA>
4    bar         D    <NA>    <NA>
5    foo         A       1       2
6    foo         B    <NA>    <NA>
7    foo         C    <NA>    <NA>
8    foo         D    <NA>    <NA>

Recommended answer

You can use Reduce() and complete.cases() as follows:

merged <- Reduce(function(x, y) merge(x, y, all=TRUE), 
                 list(master, cond_a, cond_b, cond_c, cond_d))
merged[complete.cases(merged), ]
#    userid condition answer1 answer2
# 1     bar         A       1       2
# 2     bar         B       3       4
# 4     bar         C       5       6
# 6     bar         D       7       8
# 8     foo         A       1       2
# 9     foo         B       3       4
# 11    foo         C       5       6
# 13    foo         D       7       8
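
The answer does not spell out why the attempt in the question returned NA for condition B, so here is a small sketch of the cause. By default, merge() joins on every column name the two inputs share. After the first merge, temp already has answer1 and answer2 columns, so they become join keys alongside userid and condition. The variable names shared, left_only and full are illustrative and not from the original post.

```r
# Data from the question (only the parts this sketch needs).
master <- data.frame(
  userid    = rep(c("foo", "bar"), each = 4),
  condition = rep(c("A", "B", "C", "D"), times = 2)
)
cond_a <- data.frame(userid = c("foo", "bar"), condition = "A",
                     answer1 = c("1", "1"), answer2 = c("2", "2"))
cond_b <- data.frame(userid = c("foo", "bar"), condition = "B",
                     answer1 = c("3", "3"), answer2 = c("4", "4"))

temp <- merge(master, cond_a, all.x = TRUE)

# merge() uses the intersection of the column names as join keys by default.
shared <- intersect(names(temp), names(cond_b))
print(shared)

# Left join: no cond_b row matches a temp row on all four keys,
# so the cond_b answers never make it into the result.
left_only <- merge(temp, cond_b, all.x = TRUE)

# Full join: the unmatched cond_b rows are kept as additional rows.
full <- merge(temp, cond_b, all = TRUE)

print(nrow(left_only))
print(nrow(full))
```

This is also why the answer uses all=TRUE and then filters: the full join keeps the rows from each condition frame, but the NA-filled rows inherited from master remain and have to be dropped afterwards.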

Reduce() might take some getting used to. You define a function of two arguments and provide a list of objects; Reduce() then applies the function cumulatively, passing each result on as the first argument of the next call. Thus, that statement is like doing:

temp1 <- merge(master, cond_a, all=TRUE)
temp2 <- merge(temp1, cond_b, all=TRUE)
temp3 <- merge(temp2, ....)

Or something like:

merge(merge(merge(master, cond_a, all=TRUE), cond_b, all=TRUE), cond_c, all=TRUE)
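
To see the folding behaviour in isolation, here is a minimal sketch using addition instead of merge(). The names total and steps are illustrative only.

```r
# Reduce() folds the input from left to right: ((1 + 2) + 3) + 4.
total <- Reduce(`+`, 1:4)

# accumulate = TRUE also keeps the intermediate results, which correspond
# to temp1, temp2, ... in the step-by-step version above.
steps <- Reduce(`+`, 1:4, accumulate = TRUE)

print(total)
print(steps)
```

Passing accumulate = TRUE to the merge version would likewise return every intermediate data frame, which can help when checking where rows appear or disappear.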

complete.cases() returns a logical vector indicating which rows have no missing values; this logical vector can then be used to subset the merged data.frame.
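
Because the four condition data frames have identical columns, one alternative (not part of the answer above) is to stack them with rbind() and merge once. In the sketch below, complete.cases() serves only as a check that no row is left with a missing answer. The helper make_cond is hypothetical and exists only to keep the example short; it rebuilds the same data as cond_a to cond_d in the question.

```r
master <- data.frame(
  userid    = rep(c("foo", "bar"), each = 4),
  condition = rep(c("A", "B", "C", "D"), times = 2)
)

# Hypothetical helper (not from the original post): one condition's answers
# for both users, with the scalar arguments recycled across rows.
make_cond <- function(cond, a1, a2) {
  data.frame(userid = c("foo", "bar"), condition = cond,
             answer1 = a1, answer2 = a2)
}

# Stack the per-condition frames into one long answer table.
answers <- rbind(
  make_cond("A", "1", "2"),
  make_cond("B", "3", "4"),
  make_cond("C", "5", "6"),
  make_cond("D", "7", "8")
)

# master has no answer columns yet, so the join keys are userid and condition.
merged <- merge(master, answers, all.x = TRUE)

# TRUE for each row without any NA; here every row should be complete.
ok <- complete.cases(merged)
print(merged)
print(all(ok))
```

With a single merge the answer columns never act as join keys, so no filtering step is needed; the Reduce() approach remains useful when the inputs cannot simply be stacked.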
