Merge many data frames from csv files, when ID column is implied?
Question
I'd like to merge a bunch of data frames together (because it seems many operations are easier if you're only dealing with one, but correct me if I'm wrong).
Currently I have one data frame like this:
ID, var1, var2
A, 2, 2
B, 4, 5
.
.
Z, 3, 2
Each ID is on a single row with several single measurements.
I also have a csv file with repeated measurements for each ID, like:
filename = ID_B.csv
time, var4, var5
0, 1, 2
1, 4, 5
2, 1, 6
...
What I want is:
ID, time, var1, var2, var4, var5
...
B, 0, 4, 5, 1, 2,
B, 1, 4, 5, 4, 5,
B, 2, 4, 5, 1, 6,
...
I don't really care about the column order. The only solution I can think of is to add the ID column to each csv file, then loop through them calling merge() several times. Is there a more elegant approach?
Recommended answer
My understanding is that you need to extract the ID from the filename, and then merge the imported csv with the existing data frame.
df1 <- read.csv(textConnection("ID, var1, var2
A, 2, 2
B, 4, 5"))
# assuming the imported csv-files are in working directory
# the dot is escaped and the pattern anchored so only files like ID_B.csv match
filenames <- list.files(getwd(), pattern = "^ID_[A-Z]\\.csv$")
# extract ID from filename
ids <- gsub("^ID_([A-Z])\\.csv$", "\\1", filenames)
# import csv-files and append ID
library(plyr)
import <- mdply(filenames, read.csv)
import$ID <- ids[import$Var1]
import$Var1 <- NULL
# merge imported csv-files and the existing dataframe
merge(df1, import)
Result:
ID var1 var2 time var4 var5
1 B 4 5 0 1 2
2 B 4 5 1 4 5
3 B 4 5 2 1 6
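For readers who would rather not depend on plyr, the same idea (tag each imported file with the ID from its file name, stack the files, then merge once) can be sketched in base R. This is only an illustration under stated assumptions: the temporary directory `csv_dir`, the helper `read_with_id`, and the values written to `ID_A.csv` are made up for the example and are not part of the original question or answer.

```r
# Existing one-row-per-ID data frame (values for A and B taken from the question)
df1 <- data.frame(ID = c("A", "B"), var1 = c(2, 4), var2 = c(2, 5))

# Write sample files into a temporary directory so the sketch runs anywhere.
# ID_B.csv mirrors the question; ID_A.csv holds made-up illustrative values.
csv_dir <- tempfile("csv_")
dir.create(csv_dir)
write.csv(data.frame(time = 0:2, var4 = c(1, 4, 1), var5 = c(2, 5, 6)),
          file.path(csv_dir, "ID_B.csv"), row.names = FALSE)
write.csv(data.frame(time = 0:1, var4 = c(7, 8), var5 = c(9, 10)),
          file.path(csv_dir, "ID_A.csv"), row.names = FALSE)

# Find the per-ID files; the dot is escaped so it matches only a literal "."
paths <- list.files(csv_dir, pattern = "^ID_[A-Z]\\.csv$", full.names = TRUE)

# Hypothetical helper: read one file and tag its rows with the ID
# taken from the file name
read_with_id <- function(path) {
  d <- read.csv(path)
  d$ID <- sub("^ID_([A-Z])\\.csv$", "\\1", basename(path))
  d
}

# Stack all files into one long data frame, then merge once on the ID column
long <- do.call(rbind, lapply(paths, read_with_id))
result <- merge(df1, long, by = "ID")
print(result)
```

Merging once after stacking avoids calling merge() in a loop, which is the approach the question was hoping to get away from.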