Import all txt files in folder, concatenate into data frame, use file names as variable in R?
Question
I have a folder with 142 tab-delimited text files. Each file has 19 variables, followed by a number of rows (usually no more than 30, but it varies). I want to do several things with these files in R automatically, and I can't seem to get exactly what I want with my code. I am new to loops. I got both sections of code from previous posts here on Stack Overflow, but I can't figure out how to combine their functions.
- I want to turn the file name into a variable when reading the files into R, so that each row carries its identifying file name.
- Concatenate all files (with the file name variable and no header) into one data frame with dimensions Y x 19, where Y is however many rows result.
I am able to create a list of the 142 dataframes using this code:
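# full.names=TRUE returns paths that work from any working directory
myFiles <- list.files(path="~/Documents/ForR/", pattern="\\.txt$", full.names=TRUE)
data <- lapply(myFiles, read.table, sep="\t", header=FALSE)
# Name each list element after its file (without the directory part)
names(data) <- basename(myFiles)
for (i in names(data)) {
  data[[i]]$Source <- i
}
do.call(rbind, data)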
I am able to create the dataframe I want with 19 variables, but the filename is not present:
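# Assumes the working directory is ~/Documents/ForR, because list.files() returns bare file names here
files <- list.files(path="~/Documents/ForR/.", pattern="\\.txt$")
DF <- NULL
for (f in files) {
  dat <- read.csv(f, header=FALSE, sep="\t", na.strings="", colClasses="character")
  DF <- rbind(DF, dat)
}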
How do I add the file name (without .txt if possible) as a variable to the loop?
Recommended answer
Add this line to the loop:
dat$file <- unlist(strsplit(f, split=".", fixed=TRUE))[1]
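# Assumes the working directory is ~/Documents/ForR, because list.files() returns bare file names here
files <- list.files(path="~/Documents/ForR/.", pattern="\\.txt$")
DF <- NULL
for (f in files) {
  dat <- read.csv(f, header=FALSE, sep="\t", na.strings="", colClasses="character")
  # Keep the part of the file name before the first "."
  dat$file <- unlist(strsplit(f, split=".", fixed=TRUE))[1]
  DF <- rbind(DF, dat)
}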
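The loop above keeps only the text before the first dot, so a file called run.01.txt would be labelled "run". A minimal sketch of one alternative follows, assuming you would rather strip just the extension and not depend on the working directory. The helper name read_folder, the demo directory, and the demo file names are made up for illustration; list.files(), read.delim(), basename(), and tools::file_path_sans_ext() are standard R functions.

```r
# Hypothetical helper: read every .txt file in `dir` and tag each row with its file name.
read_folder <- function(dir) {
  # full.names = TRUE returns paths that work regardless of the working directory
  paths <- list.files(path = dir, pattern = "\\.txt$", full.names = TRUE)
  frames <- lapply(paths, function(p) {
    # read.delim() defaults to tab-separated input
    dat <- read.delim(p, header = FALSE, na.strings = "",
                      colClasses = "character")
    # basename() drops the directory; file_path_sans_ext() drops only ".txt"
    dat$file <- tools::file_path_sans_ext(basename(p))
    dat
  })
  do.call(rbind, frames)
}

# Small demo with two made-up files in a temporary directory
demo_dir <- file.path(tempdir(), "ForR_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("a\tb", "c\td"), file.path(demo_dir, "run.01.txt"))
writeLines("e\tf", file.path(demo_dir, "run.02.txt"))
DF <- read_folder(demo_dir)
```

For the folder in the question, the call would be read_folder("~/Documents/ForR"). Collecting the frames in a list and binding once also avoids growing DF inside the loop, which may matter more with larger folders than with 142 small files.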
Shouldn't the row.names from the do.call be in the format names(list)[n].i, where i runs over 1:number_of_rows_for_data.frame n? If so, you can just make a column from the row.names:
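data <- lapply(myFiles, read.table, sep="\t", header=FALSE)
# rbind() builds row names from the list names, so the list must be named first
names(data) <- basename(myFiles)
combined.data <- do.call(rbind, data)
# For files with more than one row, row names have the form "<file name>.<row index>"
combined.data$file_origin <- row.names(combined.data)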
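The row names produced this way still carry the .txt extension and a row index, so they need trimming before they can serve as a clean file name column. The sketch below shows one way to do that, assuming every list name ends in .txt. The two small data frames and their names are invented stand-ins for files read with read.table().

```r
# Two made-up data frames standing in for files read with read.table()
data <- list(
  "a.txt" = data.frame(V1 = c("x", "y"), stringsAsFactors = FALSE),
  "b.txt" = data.frame(V1 = "z", stringsAsFactors = FALSE)
)

combined.data <- do.call(rbind, data)
rn <- row.names(combined.data)

# Strip ".txt" together with the optional ".<row index>" suffix added by rbind()
combined.data$file_origin <- sub("\\.txt(\\.[0-9]+)?$", "", rn)
```

The index part of the pattern is optional so that it still matches if a row name consists of the bare file name with no index appended.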