Merge multiple files in a list with different number of rows

Problem description

I have multiple files in a list and I want to merge them based on the Year column, so that my new file looks like Merged_file. I could use merge(file1, file2, by="Year") if I had 2 files, but I don't know how to do that for multiple files in a list. I also tried newlist <- lapply(files, function(t) do.call(rbind.fill, t)), but it's not what I want.

file1             file2                Merged_file

Year  Value1      Year  Value2         Year Value1 Value2
2001   1          2000   0.5           2001  1       0.3
2001   2          2000   0.6           2001  2       0.3 
2002   2          2001   0.3           2002  2       0.5
2002   3          2001   0.3           2002  3       0.6
2003   3          2002   0.5           2003  3       0.6       
2003   4          2002   0.6           2003  4       0.6
                  2003   0.6
                  2003   0.6

Solution

You say there are not the same number of rows in each data set; are there the same number of rows for any single year, though? I get the sense that you want to take subsets of the files with the same year and combine (cbind) them, but I'm not sure. See if this does what you want/mean:

file1 <- read.table(text=
"Year  Value1      
2001   1          
2001   2          
2002   2          
2002   3          
2003   3                
2003   4", header=TRUE)

file2 <- read.table(text=
"Year  Value2         
2000   0.5           
2000   0.6           
2001   0.3           
2001   0.3           
2002   0.5           
2002   0.6           
2003   0.6           
2003   0.6", header=TRUE)

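# Bind two data frames side by side, one group at a time, for every value of
# the shared column (by default the only column name both have in common).
bind.by.var <- function(file1, file2, var = intersect(names(file1), names(file2))) {
    # Loop over the values of `var` that occur in both data frames
    do.call(rbind, lapply(intersect(file1[[var]], file2[[var]]), function(y) {
        # Rows of file1 for this value, plus file2's other columns for it
        cbind(file1[file1[[var]]==y,],
              file2[file2[[var]]==y,setdiff(names(file2),var),drop=FALSE])
    }))
}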

The function bind.by.var figures out which column the two files have in common (Year), then which years appear in both files. Then, year by year, it binds the matching rows side by side. I don't know if this is what you want in general, but it does match your Merged_file example:

> bind.by.var(file1, file2)
  Year Value1 Value2
1 2001      1    0.3
2 2001      2    0.3
3 2002      2    0.5
4 2002      3    0.6
5 2003      3    0.6
6 2003      4    0.6

Given this function and a list of files, you can apply Reduce to the list:

Reduce(bind.by.var, list(file1, file2))

Here, replace the explicit list with your own list of data frames read in from the files.
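
As a rough sketch of that step, the snippet below writes three small tables to temporary files, reads them back into a list with lapply, and folds the list with Reduce. The temporary files, the third table (Value3), and the names tables, paths, files and merged are made up for illustration; bind.by.var is repeated from the answer so the snippet runs on its own.

```r
# bind.by.var as defined in the answer, repeated so this runs standalone
bind.by.var <- function(file1, file2, var = intersect(names(file1), names(file2))) {
    do.call(rbind, lapply(intersect(file1[[var]], file2[[var]]), function(y) {
        cbind(file1[file1[[var]] == y, ],
              file2[file2[[var]] == y, setdiff(names(file2), var), drop = FALSE])
    }))
}

# Hypothetical input: three tables written to temporary files
tables <- list(
    data.frame(Year = c(2001, 2001, 2002, 2002, 2003, 2003),
               Value1 = c(1, 2, 2, 3, 3, 4)),
    data.frame(Year = c(2000, 2000, 2001, 2001, 2002, 2002, 2003, 2003),
               Value2 = c(0.5, 0.6, 0.3, 0.3, 0.5, 0.6, 0.6, 0.6)),
    data.frame(Year = c(2002, 2002, 2003, 2003),
               Value3 = c(10, 20, 30, 40))
)
paths <- vapply(tables, function(tab) {
    path <- tempfile(fileext = ".txt")
    write.table(tab, path, row.names = FALSE)
    path
}, character(1))

# Read every file into a list of data frames, then fold the list pairwise
files <- lapply(paths, read.table, header = TRUE)
merged <- Reduce(bind.by.var, files)
print(merged)
```

Only 2002 and 2003 occur in all three tables here, so the folded result keeps just the rows for those two years, with one Value column per file.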

The assumption here is that there are the same number of rows for any one year in each file. If that is not the case, you need to explain how you want data from a year to be combined/merged.
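
If the row counts per year do differ, one possible reading (an assumption, not something the question states) is to pair the k-th row of a year in one file with the k-th row of that year in the next, and fill the gaps with NA. A minimal sketch using base R's ave and merge follows; the helper add.row.index, the column .idx and the small sample data are hypothetical.

```r
# Hypothetical helper: number the rows within each value of `var`
add.row.index <- function(df, var = "Year") {
    df$.idx <- ave(seq_len(nrow(df)), df[[var]], FUN = seq_along)
    df
}

# Made-up data where the per-year row counts differ between the two tables
file1 <- data.frame(Year = c(2001, 2001, 2002), Value1 = c(1, 2, 3))
file2 <- data.frame(Year = c(2001, 2002, 2002), Value2 = c(0.3, 0.5, 0.6))

# Index every table, then full-join pairwise on Year and the within-year index
indexed <- lapply(list(file1, file2), add.row.index)
merged <- Reduce(function(a, b) merge(a, b, by = c("Year", ".idx"), all = TRUE),
                 indexed)
merged$.idx <- NULL  # drop the helper column again
print(merged)
```

Unlike bind.by.var, this keeps years and rows that are missing from some files, so whether it fits depends on how the unmatched rows should be treated.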
