Merging files (and file names) in R

Problem description

I'm trying to merge a directory full of comma delimited text files using R, while also incorporating the file name of each file as a new variable in the data set.

I've been using the following:

library(plyr)
file_list <- list.files()
dataset <- ldply(file_list, read.table, header=FALSE, sep=",")

Can anyone shed any light on how I'd add the file name for each file read as a new variable within dataset?

Many thanks,

-Jon

Solution

You can just make a wrapper around the read.table() function that adds in your filename variable. Something like this should work:

read.data <- function(file, header=FALSE){
  # Read one comma-delimited file, then record which file its rows came from
  dat <- read.table(file, header=header, sep=",")
  dat$fname <- file
  return(dat)
}

Once that is defined, you just need to apply the function across your data files. Since you didn't post any example data I'm not sure what it actually looks like, but for now I'll assume it's clean and that rbind() is sufficient to join the pieces together, in which case this example should illustrate the function in action. Note that write.csv() writes a header row, so the example passes header=TRUE through lapply():

> data(iris)
> write.csv(iris,file="iris1.csv",row.names=FALSE)
> write.csv(iris,file="iris2.csv",row.names=FALSE)
> dataset <- do.call(rbind, lapply(list.files(pattern="csv$"),read.data,header=TRUE))
> head(dataset)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species     fname
1          5.1         3.5          1.4         0.2  setosa iris1.csv
2          4.9         3.0          1.4         0.2  setosa iris1.csv
3          4.7         3.2          1.3         0.2  setosa iris1.csv
4          4.6         3.1          1.5         0.2  setosa iris1.csv
5          5.0         3.6          1.4         0.2  setosa iris1.csv
6          5.4         3.9          1.7         0.4  setosa iris1.csv
> table(dataset$fname)

iris1.csv iris2.csv 
      150       150 
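
For anyone who wants to try the idea without touching their working directory, here is a minimal base R sketch of the same wrapper-plus-rbind() approach. The names read_with_name, tmp_dir and merge_demo are made up for illustration, and it assumes every file has a header row and the same columns. It also stores basename(path) rather than the full path, which is a choice made here, not something from the answer above.

```r
# Hypothetical helper: read one comma-separated file and tag its rows
# with the name of the file they came from.
read_with_name <- function(path, header = TRUE) {
  dat <- read.table(path, header = header, sep = ",", stringsAsFactors = FALSE)
  dat$fname <- basename(path)  # keep only the file name, not the directory
  dat
}

# Work in a throwaway directory so the sketch does not touch existing files.
tmp_dir <- file.path(tempdir(), "merge_demo")
dir.create(tmp_dir, showWarnings = FALSE)
write.csv(iris, file = file.path(tmp_dir, "iris1.csv"), row.names = FALSE)
write.csv(iris, file = file.path(tmp_dir, "iris2.csv"), row.names = FALSE)

# full.names = TRUE returns paths that can be read from any working directory;
# the escaped dot makes the pattern match only names ending in ".csv".
files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file with the wrapper, then stack the pieces row-wise.
merged <- do.call(rbind, lapply(files, read_with_name))

table(merged$fname)
```

If your files have no header row, as in the original question, pass header = FALSE through lapply() in the same way the example above passes header=TRUE.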
