Merge multiple data.frames in R with varying row length

Problem description

I'm relatively new to R and trying to figure out how to merge multiple data.frames with varying numbers of rows but all with a common column, "Year". I've looked through similar questions, and one of them, "Merge dataframes, different lengths", provided a great answer. However, when I applied it to my own data, I couldn't get it to work with more than two data.frames; I always receive an error message.

Sample data:

> df1 <- data.frame(Year=2006:2011, Site1=c("2.3", "1"  , "3.1", "2.9", "1.4", "3"))  
> df2 <- data.frame(Year=2007:2011, Site2=c("2.7", "4.1", "1.1", "2.6", "3.1"))  
> df3 <- data.frame(Year=2008:2011, Site3=c("1.3", "2"  , "3.6", "1.7"))  

The goal is to produce a single data.frame where column 1 is the year, column 2 is site 1, column 3 is site 2, and so on. I have ~17 data.frames currently (there will be up to 40), corresponding to 17 sites with variable timelines/number of rows.

Any help would be appreciated.

Code I've tried:

> NewDF <- merge(df1, df2, by="Year", all.x=TRUE, all.y=TRUE)  

This worked great for 2 data.frames, but when I tried to add in another data.frame, I received the error message:

> NewDF <- merge(list=c(df1, df2, df3), by="Year", all.x=TRUE, all.y=TRUE)  
 Error in as.data.frame(x) : argument "x" is missing, with no default

Solution

You want to merge the result with df3, i.e.:

merge(df3, merge(df1, df2, by="Year", all.x=TRUE, all.y=TRUE), by = "Year", all.x = TRUE, all.y = TRUE)
#  Year Site3 Site1 Site2
#1 2006  <NA>   2.3  <NA>
#2 2007  <NA>     1   2.7
#3 2008   1.3   3.1   4.1
#4 2009     2   2.9   1.1
#5 2010   3.6   1.4   2.6
#6 2011   1.7     3   3.1
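
For context, merge() joins exactly two data.frames (its arguments are x and y), which is why passing list = ... leaves x missing. Here is a minimal sketch of the same two-step merge, written with all = TRUE, which is shorthand for setting both all.x and all.y. The names step1 and merged are just illustrative, and the sample data is copied from the question.

```r
# Sample data from the question (site values are stored as strings there)
df1 <- data.frame(Year = 2006:2011, Site1 = c("2.3", "1", "3.1", "2.9", "1.4", "3"))
df2 <- data.frame(Year = 2007:2011, Site2 = c("2.7", "4.1", "1.1", "2.6", "3.1"))
df3 <- data.frame(Year = 2008:2011, Site3 = c("1.3", "2", "3.6", "1.7"))

# all = TRUE keeps unmatched rows from both sides (a full outer join on Year)
step1  <- merge(df1, df2, by = "Year", all = TRUE)

# Merging in this order puts the columns in Site1, Site2, Site3 order
merged <- merge(step1, df3, by = "Year", all = TRUE)
```

Years that are missing from a given site come back as NA in that site's column.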

Or, if you have your data.frames in a list, use Reduce to generalize the above:

Reduce(function(x,y) merge(x, y, by = "Year", all.x = TRUE, all.y = TRUE),
       list(df1, df2, df3))
#  Year Site1 Site2 Site3
#1 2006   2.3  <NA>  <NA>
#2 2007     1   2.7  <NA>
#3 2008   3.1   4.1   1.3
#4 2009   2.9   1.1     2
#5 2010   1.4   2.6   3.6
#6 2011     3   3.1   1.7
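
Since the question mentions 17 to 40 sites, here is a rough sketch of how the Reduce approach might be wrapped up for a longer list. It also converts the site columns to numeric, on the assumption that the values are meant to be numbers (the sample data stores them as strings). The names site_list, merge_by_year, wide and site_cols are made up for illustration.

```r
# Sample data from the question (site values are stored as strings there)
df1 <- data.frame(Year = 2006:2011, Site1 = c("2.3", "1", "3.1", "2.9", "1.4", "3"))
df2 <- data.frame(Year = 2007:2011, Site2 = c("2.7", "4.1", "1.1", "2.6", "3.1"))
df3 <- data.frame(Year = 2008:2011, Site3 = c("1.3", "2", "3.6", "1.7"))

# Collect all data.frames in one list; add as many sites as needed
site_list <- list(df1, df2, df3)

# Full outer join of two data.frames on Year
merge_by_year <- function(x, y) merge(x, y, by = "Year", all = TRUE)

# Reduce applies merge_by_year pairwise, left to right, across the list
wide <- Reduce(merge_by_year, site_list)

# Assumption: site values should be numeric. as.character() first makes the
# conversion safe whether the columns are character or factor.
site_cols <- setdiff(names(wide), "Year")
wide[site_cols] <- lapply(wide[site_cols],
                          function(col) as.numeric(as.character(col)))
```

After the conversion, missing years print as NA rather than <NA>, and the site columns can be used directly in arithmetic or plotting.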
