Merge multiple files in a list with different number of rows
I have multiple files in a list and I want to merge them based on the Year column, so that my new file looks like Merged_file. I could use merge(file1, file2, by="Year") if I had 2 files, but I don't know how to do that for multiple files in a list. I also tried newlist <- lapply(files, function(t) do.call(rbind.fill, t)), but it's not what I want.
file1          file2          Merged_file
Year Value1    Year Value2    Year Value1 Value2
2001 1         2000 0.5       2001 1      0.3
2001 2         2000 0.6       2001 2      0.3
2002 2         2001 0.3       2002 2      0.5
2002 3         2001 0.3       2002 3      0.6
2003 3         2002 0.5       2003 3      0.6
2003 4         2002 0.6       2003 4      0.6
               2003 0.6
               2003 0.6
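For reference, a plain merge() on Year does not produce the Merged_file layout when a year repeats, because every row of a year in one table is paired with every row of that year in the other. A minimal sketch, assuming the two example tables above are entered directly as data frames:

```r
# Example data from the tables above, entered as data frames
file1 <- data.frame(Year   = c(2001, 2001, 2002, 2002, 2003, 2003),
                    Value1 = c(1, 2, 2, 3, 3, 4))
file2 <- data.frame(Year   = c(2000, 2000, 2001, 2001, 2002, 2002, 2003, 2003),
                    Value2 = c(0.5, 0.6, 0.3, 0.3, 0.5, 0.6, 0.6, 0.6))

# merge() pairs each file1 row of a year with each file2 row of that year,
# so every shared year contributes 2 x 2 = 4 rows instead of 2
merged <- merge(file1, file2, by = "Year")
nrow(merged)  # 12 rows, not the 6 rows of Merged_file
```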
You say there are not the same number of rows in each data set; are there the same number of rows for any single year, though? I get the sense that you want to take the subsets of the files with the same year and combine (cbind) them, but I'm not sure. See if this does what you want:
file1 <- read.table(text=
"Year Value1
2001 1
2001 2
2002 2
2002 3
2003 3
2003 4", header=TRUE)
file2 <- read.table(text=
"Year Value2
2000 0.5
2000 0.6
2001 0.3
2001 0.3
2002 0.5
2002 0.6
2003 0.6
2003 0.6", header=TRUE)
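# Bind two data frames side by side, one shared value of var at a time.
# var defaults to the column name the two data frames have in common.
bind.by.var <- function(file1, file2, var = intersect(names(file1), names(file2))) {
  # For each value of var present in both data frames, cbind the matching rows
  do.call(rbind, lapply(intersect(file1[[var]], file2[[var]]), function(y) {
    cbind(file1[file1[[var]] == y, ],
          # drop = FALSE keeps a data frame even if a single column remains
          file2[file2[[var]] == y, setdiff(names(file2), var), drop = FALSE])
  }))
}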
The function bind.by.var figures out which column the two files have in common (Year), then which years appear in both files. Then, year by year, it combines (binds) the matching rows together. I don't know if this is in general what you want, but it does match your Merged_file example:
> bind.by.var(file1, file2)
  Year Value1 Value2
1 2001      1    0.3
2 2001      2    0.3
3 2002      2    0.5
4 2002      3    0.6
5 2003      3    0.6
6 2003      4    0.6
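To make the steps inside bind.by.var easier to follow, here is a rough walk-through for a single year. The intermediate names (years, left, right, piece) are only illustrative and do not appear in the function itself; the data is the example data, entered as data frames.

```r
file1 <- data.frame(Year   = c(2001, 2001, 2002, 2002, 2003, 2003),
                    Value1 = c(1, 2, 2, 3, 3, 4))
file2 <- data.frame(Year   = c(2000, 2000, 2001, 2001, 2002, 2002, 2003, 2003),
                    Value2 = c(0.5, 0.6, 0.3, 0.3, 0.5, 0.6, 0.6, 0.6))

# Step 1: the column name shared by both data frames
var <- intersect(names(file1), names(file2))    # "Year"

# Step 2: the years that occur in both data frames
years <- intersect(file1[[var]], file2[[var]])  # 2001 2002 2003

# Step 3: for one year, take the matching rows of each data frame;
# the shared column is dropped from the second so it is not duplicated
y     <- years[1]
left  <- file1[file1[[var]] == y, ]
right <- file2[file2[[var]] == y, setdiff(names(file2), var), drop = FALSE]

# Step 4: glue the two pieces side by side
piece <- cbind(left, right)
piece
```

bind.by.var repeats steps 3 and 4 for every shared year and stacks the pieces with rbind.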
Given this and a list of files, you can use Reduce on it:
Reduce(bind.by.var, list(file1, file2))
where you replace the explicit list with your list of data frames read in from files.
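As a sketch of what that could look like end to end: the directory and file names below are hypothetical, and the two example tables are first written to a temporary directory only so that the snippet is self-contained.

```r
bind.by.var <- function(file1, file2, var = intersect(names(file1), names(file2))) {
  do.call(rbind, lapply(intersect(file1[[var]], file2[[var]]), function(y) {
    cbind(file1[file1[[var]] == y, ],
          file2[file2[[var]] == y, setdiff(names(file2), var), drop = FALSE])
  }))
}

# Hypothetical setup: write the example tables to a temporary directory
dir <- file.path(tempdir(), "merge_example")
dir.create(dir, showWarnings = FALSE)
writeLines(c("Year Value1", "2001 1", "2001 2", "2002 2",
             "2002 3", "2003 3", "2003 4"),
           file.path(dir, "file1.txt"))
writeLines(c("Year Value2", "2000 0.5", "2000 0.6", "2001 0.3", "2001 0.3",
             "2002 0.5", "2002 0.6", "2003 0.6", "2003 0.6"),
           file.path(dir, "file2.txt"))

# Read every file into a list of data frames, then fold the list pairwise
paths  <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
files  <- lapply(paths, read.table, header = TRUE)
merged <- Reduce(bind.by.var, files)
merged
```

The same pattern should extend to more than two files, provided Year is the only column name they share.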
The assumption here is that there are the same number of rows for any one year in each file. If that is not the case, you need to explain how you want data from a year to be combined/merged.
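One way to check that assumption before binding is to compare the per-year row counts of two data frames. The helper below, same.rows.per.year, is a hypothetical name introduced only for this sketch and is not part of the answer's code.

```r
# Hypothetical helper: TRUE if, for every value of var present in both
# data frames, the two data frames have the same number of rows
same.rows.per.year <- function(a, b, var = "Year") {
  common  <- intersect(a[[var]], b[[var]])
  # Values outside `common` become NA and are not counted by table()
  count.a <- table(factor(a[[var]], levels = common))
  count.b <- table(factor(b[[var]], levels = common))
  all(count.a == count.b)
}

file1 <- data.frame(Year   = c(2001, 2001, 2002, 2002, 2003, 2003),
                    Value1 = c(1, 2, 2, 3, 3, 4))
file2 <- data.frame(Year   = c(2000, 2000, 2001, 2001, 2002, 2002, 2003, 2003),
                    Value2 = c(0.5, 0.6, 0.3, 0.3, 0.5, 0.6, 0.6, 0.6))

ok  <- same.rows.per.year(file1, file2)         # counts match in every shared year
bad <- same.rows.per.year(file1, file2[-3, ])   # one 2001 row removed from file2
```

Here ok is TRUE and bad is FALSE, since dropping a 2001 row from file2 leaves that year with one row against two in file1.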