How to merge csv files from nested folders in R


Problem description

I have a large collection of csv files spread across different folders, and folders within folders, that I need to merge into one file. It would be easy if they were all in one directory, but I don't know of a simple way to pull them all out of the different folders. I could combine them one by one, but there are a lot of them.

For example:

+ working directory
|
+-- · data.csv
+-- · data2.csv
+-- + NewFolder
    |
    +-- · data3.csv
    +-- + NewFolder2
        |
        +-- · data4.csv

I want one file that combines all of the data csv files.

Recommended answer

You can use dir() with recursive set to TRUE to list all files in the folder tree, and you can use pattern to define a regular expression that filters for the .csv files. An example:

csv_files <- dir(pattern='.*[.]csv', recursive = T)

Or, even better and simpler (thanks to speendo for the comment):

csv_files <- dir(pattern='*.csv$', recursive = T)

Explanation:

  • pattern='*.csv$': The pattern argument must be a regular expression that filters the file names. This regular expression keeps only the file names that end with .csv. Note that an unescaped . matches any character, so a stricter form is pattern='\\.csv$'.

    If you only want files whose names start with data, try a pattern like this: pattern='^data.*\\.csv$'

  • recursive=T: Forces dir() to traverse recursively through all folders below the working directory.
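
To see the two arguments at work, here is a minimal sketch that builds a throwaway copy of the folder tree from the question and lists its csv files. The temp folder name csv_demo, the id column, and the extra notes.txt file are assumptions made up for the illustration; the path and full.names arguments are used so the example does not depend on the current working directory.

```r
# Build a throwaway folder tree that mirrors the example layout (hypothetical data).
root <- file.path(tempdir(), "csv_demo")
unlink(root, recursive = TRUE)
dir.create(file.path(root, "NewFolder", "NewFolder2"), recursive = TRUE)
write.csv(data.frame(id = 1), file.path(root, "data.csv"), row.names = FALSE)
write.csv(data.frame(id = 2), file.path(root, "data2.csv"), row.names = FALSE)
write.csv(data.frame(id = 3), file.path(root, "NewFolder", "data3.csv"), row.names = FALSE)
write.csv(data.frame(id = 4), file.path(root, "NewFolder", "NewFolder2", "data4.csv"), row.names = FALSE)
writeLines("not a csv", file.path(root, "NewFolder", "notes.txt"))

# "\\.csv$" escapes the dot, so only names ending in a literal ".csv" match.
# full.names = TRUE returns paths that can be passed straight to read.csv().
csv_files <- dir(path = root, pattern = "\\.csv$", recursive = TRUE, full.names = TRUE)
print(basename(csv_files))
```

The notes.txt file is skipped, and the files in NewFolder and NewFolder2 are found alongside the ones at the top level.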

Once you have the file list, and assuming all of the files have the same structure (i.e. the same columns), you can merge them with read.csv() and rbind():

# Read each file in turn and append its rows to df.
for (i in seq_along(csv_files)) {
  if (i == 1) {
    df <- read.csv(csv_files[i])
  } else {
    df <- rbind(df, read.csv(csv_files[i]))
  }
}
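
The loop grows the data frame one file at a time. A common base R alternative, sketched here under the same assumption that every file has identical columns, reads all the files into a list first and binds them in one call. It also writes the combined result to a single csv, which is what the question asks for. The folder name merge_demo, the sample columns id and value, and the output name combined.csv are made up for the illustration.

```r
# Hypothetical input: two small files with identical columns, written to a temp folder.
root <- file.path(tempdir(), "merge_demo")
unlink(root, recursive = TRUE)
dir.create(file.path(root, "NewFolder"), recursive = TRUE)
write.csv(data.frame(id = 1:2, value = c(10, 20)),
          file.path(root, "data.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, value = c(30, 40)),
          file.path(root, "NewFolder", "data3.csv"), row.names = FALSE)

csv_files <- dir(path = root, pattern = "\\.csv$", recursive = TRUE, full.names = TRUE)

# Read every file into a list of data frames, then bind them in a single call.
tables <- lapply(csv_files, read.csv)
df <- do.call(rbind, tables)

# Write the result outside the scanned tree so a rerun does not pick it up as input.
out_file <- file.path(tempdir(), "combined.csv")
write.csv(df, out_file, row.names = FALSE)
```

Saving the output outside the folder being scanned is a deliberate choice: if combined.csv sat inside the tree, the next recursive listing would include it and its rows would be merged a second time.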


Ramnath suggests in his comment a faster way to merge the .csv files (again, assuming all of them have the same structure):

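library(dplyr)
library(readr)  # read_csv() comes from readr, not dplyr
# bind_rows() is the replacement for the deprecated rbind_all()
df <- bind_rows(lapply(csv_files, read_csv))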
