Import all txt files in folder, concatenate into data frame, use file names as variable in R?


Question

I have a folder with 142 tab-delimited text files. Each file has 19 variables, followed by a varying number of rows (usually no more than 30). I want to do several things with these files in R automatically, but I can't get exactly what I want with my code. I am new to loops; I took both sections of code below from previous posts here on Stack Overflow, but can't figure out how to combine their functions.


  1. I want to turn the filename into a variable when reading the files into R, so that each row has the identifying file name

  2. Concatenate all files (with the filename variable and no header) into one data frame with dimensions Y x 19, where Y is however many rows result.

I am able to create a list of the 142 dataframes using this code:

myFiles = list.files(path="~/Documents/ForR/", pattern="*.txt")
data <- lapply(myFiles, read.table, sep="\t", header=FALSE)
names(data) <- myFiles
for (i in myFiles)
    data[[i]]$Source = i
do.call(rbind, data)

I am able to create the dataframe I want with 19 variables, but the filename is not present:

files <- list.files(path="~/Documents/ForR/.", pattern=".txt")
DF <- NULL
for (f in files) {
    dat <- read.csv(f, header=F, sep="\t", na.strings="", colClasses="character")
    DF <- rbind(DF, dat)
}

How do I add the file name (without .txt if possible) as a variable to the loop?

Recommended answer

Add this line to the loop:

    dat$file <- unlist(strsplit(f, split=".", fixed=TRUE))[1]

It splits the file name at the literal dot and keeps the first piece, which drops the .txt extension.

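# list.files() returns bare file names by default, so this assumes the
# working directory is the folder that holds the files.
files <- list.files(path="~/Documents/ForR/.", pattern=".txt")
DF <- NULL
for (f in files) {
    dat <- read.csv(f, header=FALSE, sep="\t", na.strings="", colClasses="character")
    # Keep the part of the file name before the first literal dot
    dat$file <- unlist(strsplit(f, split=".", fixed=TRUE))[1]
    DF <- rbind(DF, dat)
}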
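
As a variation on the loop above, here is a minimal sketch that reads each file through a small helper and binds the results once at the end. It is not part of the original answer. The demo folder, the sample file names (plot01.txt, plot02.txt) and the helper read_one are made up for illustration; the sketch also assumes that full paths (full.names = TRUE) and tools::file_path_sans_ext() are acceptable substitutes for the strsplit() call.

```r
# Sketch only: builds two small tab-delimited files in a temporary folder
# so the example is self-contained; replace `folder` with your own path.
folder <- file.path(tempdir(), "ForR_demo")   # hypothetical demo folder
dir.create(folder, showWarnings = FALSE)
writeLines(c("a\t1", "b\t2"), file.path(folder, "plot01.txt"))
writeLines(c("c\t3"), file.path(folder, "plot02.txt"))

# full.names = TRUE returns complete paths, so reading does not depend on
# the working directory; the pattern is a regular expression that matches
# names ending in ".txt".
files <- list.files(path = folder, pattern = "\\.txt$", full.names = TRUE)

# Hypothetical helper: read one file and tag every row with its source
read_one <- function(f) {
  dat <- read.delim(f, header = FALSE, na.strings = "",
                    colClasses = "character")
  # File name without directory or extension, e.g. "plot01"
  dat$file <- tools::file_path_sans_ext(basename(f))
  dat
}

# Read everything first, then bind once instead of growing DF inside a loop
DF <- do.call(rbind, lapply(files, read_one))
print(DF)
```

Unlike taking the first piece of strsplit(), file_path_sans_ext() removes only the final extension, so a name such as site.A.txt would keep its inner dot.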

Shouldn't the row names from do.call(rbind, ...) be in the format names(list)[n].i, where i runs over 1:number_of_rows of data frame n? If so, you can just make a column from the row names:

data <- lapply(myFiles, read.table, sep="\t", header=FALSE)
names(data) <- myFiles  # rbind takes its row-name prefixes from the list names
combined.data <- do.call(rbind, data)
combined.data$file_origin <- row.names(combined.data)
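
With this approach the row names typically carry a row suffix (for example a.txt.1), so file_origin is not a clean file name. The sketch below shows one possible alternative that avoids parsing row names: repeat each list name once per row of its data frame. The two small data frames and the names a.txt and b.txt are invented stand-ins for the files read with read.table(); rows_per_file is a hypothetical helper variable.

```r
# Sketch only: two small data frames stand in for the files read from disk
data <- list(
  "a.txt" = data.frame(V1 = c("x", "y"), V2 = c(1, 2)),
  "b.txt" = data.frame(V1 = c("z", "w", "v"), V2 = c(3, 4, 5))
)

combined.data <- do.call(rbind, data)

# Repeat each list name once per row of its data frame; this does not
# depend on how rbind() formats the row names.
rows_per_file <- vapply(data, nrow, integer(1))
combined.data$file_origin <- rep(tools::file_path_sans_ext(names(data)),
                                 times = rows_per_file)

# Replace the generated row names with plain row numbers
rownames(combined.data) <- NULL
print(combined.data)
```

Because rbind() keeps the list order, the repeated names line up with the rows of the combined data frame.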
