Merging files (and file names) in R
I'm trying to merge a directory full of comma delimited text files using R, while also incorporating the file name of each file as a new variable in the data set.
I've been using the following:
library(plyr)
file_list <- list.files()
dataset <- ldply(file_list, read.table, header=FALSE, sep=",")
Can anyone shed any light on how I'd add the file name for each file read as a new variable within dataset?
Many thanks,
-Jon
You can just make a wrapper around the read.table() function that adds in your filename variable. Something like this should work:
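read.data <- function(file) {
  # Read one comma-delimited file that has no header row
  dat <- read.table(file, header = FALSE, sep = ",")
  # Record which file these rows came from
  dat$fname <- file
  return(dat)
}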
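Once that's defined, you just need to apply the function across your data files. Since you didn't post any example data I'm not sure what it actually looks like, but for now I'll assume it's clean as can be and that rbind() is sufficient to join them together. Note that write.csv() writes a header row, so the output below corresponds to reading the demo files with header=TRUE in read.data(); keep header=FALSE for files with no header line. This example should illustrate the function in action: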
> data(iris)
> write.csv(iris,file="iris1.csv",row.names=F)
> write.csv(iris,file="iris2.csv",row.names=F)
> dataset <- do.call(rbind, lapply(list.files(pattern="csv$"),read.data))
> head(dataset)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species     fname
1          5.1         3.5          1.4         0.2  setosa iris1.csv
2          4.9         3.0          1.4         0.2  setosa iris1.csv
3          4.7         3.2          1.3         0.2  setosa iris1.csv
4          4.6         3.1          1.5         0.2  setosa iris1.csv
5          5.0         3.6          1.4         0.2  setosa iris1.csv
6          5.4         3.9          1.7         0.4  setosa iris1.csv
> table(dataset$fname)
iris1.csv iris2.csv
      150       150
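For anyone who wants to try this end to end, here is a minimal base-R sketch of the same idea. The helper name `read_with_name`, the `demo_dir` folder, and the column-name check are my own assumptions for illustration and are not part of the answer above; the helper forwards extra arguments to read.table() so the same function can handle files with or without a header row.

```r
# Hypothetical helper (not from the original answer): like read.data(),
# but forwards extra arguments such as `header` to read.table().
read_with_name <- function(file, ...) {
  dat <- read.table(file, sep = ",", ...)
  dat$fname <- basename(file)  # store only the file name, not the full path
  dat
}

# Demo in a temporary directory so no existing files are touched.
demo_dir <- file.path(tempdir(), "merge_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(iris, file = file.path(demo_dir, "iris1.csv"), row.names = FALSE)
write.csv(iris, file = file.path(demo_dir, "iris2.csv"), row.names = FALSE)

# full.names = TRUE returns paths that work regardless of the working directory.
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# write.csv() writes a header row, so these demo files need header = TRUE.
pieces <- lapply(files, read_with_name, header = TRUE)

# rbind() needs identical column names in every piece; fail early otherwise.
same_cols <- vapply(
  pieces,
  function(p) identical(names(p), names(pieces[[1]])),
  logical(1)
)
stopifnot(all(same_cols))

dataset <- do.call(rbind, pieces)
table(dataset$fname)
```

For headerless files like the ones in the question, the call would presumably be `lapply(files, read_with_name, header = FALSE)` instead. Since plyr's ldply() passes extra arguments on to the function it applies, `ldply(files, read_with_name, header = FALSE)` should work as a drop-in for the do.call(rbind, lapply(...)) step if you prefer to stay with plyr.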