Save files into a specific subfolder in a loop in R
Question
I feel I am very close to the solution, but at the moment I can't figure out how to get there.
I've got the following problem. In my folder "Test" I have stacked data files named M1_1, M1_2, M1_3 and so on, for example /Test/M1_1.dat.
Now I want to separate the files so that I get M1_1[1].dat, M1_1[2].dat, M1_1[3].dat and so on. I'd like to save these files in specific subfolders: Test/M1/M1_1[1], Test/M1/M1_1[2] and so on, and Test/M2/M1_2[1], Test/M2/M1_2[2] and so on.
I have already created the subfolders, and I have the following loop to split up the files so that I get M1_1.dat[1] and so on:
for (e in dir(path = "Test/", pattern = ".dat", full.names = TRUE, recursive = TRUE)) {
  data <- read.table(e, header = TRUE)
  df <- data[ -c(2) ]
  out <- split(df, f = df$.imp)
  lapply(names(out), function(z) {
    write.table(out[[z]], paste0(e, "[", z, "].dat"),
                sep = "\t", row.names = FALSE, col.names = FALSE)
  })
}
The paste0 call gives me the split data I want (although the files are named M1_1.dat[1] instead of M1_1[1].dat), but I can't figure out how to get this data into my subfolders.
Maybe you have an idea?
Thanks in advance.
Recommended answer
I don't have any idea what your data looks like, so I am going to attempt to recreate the scenario with the gender datasets available at baby names.
Assuming all the files from the zip folder are stored in "inst/data":
all_fi <- list.files("inst/data",
                     full.names = TRUE,
                     recursive = TRUE,
                     pattern = "\\.txt$")
> head(all_fi, 3)
[1] "inst/data/yob1880.txt" "inst/data/yob1881.txt"
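A side note on the pattern argument: list.files() and dir() treat it as a regular expression, so the unanchored ".dat" used in the question also matches names that merely contain "dat" after any character. The sketch below uses made-up file names (assumptions for illustration only) to compare that pattern with an escaped, anchored one in the style of "\\.txt$" above.

```r
# Hypothetical file names, only to compare the two patterns
files <- c("M1_1.dat", "mydat.csv", "M1_1.dat.bak")

# Unanchored: "." matches any single character, so "ydat" and ".dat." match too
loose <- grepl(".dat", files)

# Escaped dot plus "$": only names that really end in ".dat" match
strict <- grepl("\\.dat$", files)

data.frame(files, loose, strict)
```

Files that were already split still end in ".dat", so a recursive listing would pick them up again on a second run even with the anchored pattern.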
Define the function that will be applied to each file in the directory:
library(tools)     # file_path_sans_ext()
library(dplyr)     # %>%, select(), mutate()
library(jsonlite)  # rbind.pages(); newer jsonlite releases name it rbind_pages()

f.it <- function(f_in = NULL){
  # Create the new folder, named after the input file without its extension
  new_folder <- file_path_sans_ext(f_in)
  dir.create(new_folder, showWarnings = FALSE)
  data.table::fread(f_in) %>%
    select(name = 1, gender = 2, freq = 3) %>%
    mutate(
      gender = ifelse(grepl("F", gender), "female", "male")
    ) %>% (function(x){
      # Dataset contains names for males and females
      # so that's what I'm using to mimic your split
      out <- split(x, x$gender)
      o <- rbind.pages(
        lapply(names(out), function(i){
          # New filename for each iteration of the split dataframes
          ###### THIS IS WHERE YOU NEED TO TWEAK FOR YOUR NEEDS
          new_dest_file <- sprintf("%s/%s.txt", new_folder, i)
          # Write the sub-data-frame to the new file
          data.table::fwrite(out[[i]], new_dest_file)
          # For our purposes return a dataframe with file info on the new
          # files...
          data.frame(
            file_name = new_dest_file,
            file_size = file.size(new_dest_file),
            stringsAsFactors = FALSE)
        })
      )
      o
    })
}
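The line marked as the place to tweak is where the destination path is built. As a rough sketch of what that tweak could look like for names such as M1_1[1].dat, the helper below strips the extension first and only then appends the split index. The name build_dest and its out_dir argument are hypothetical and not part of the answer above.

```r
library(tools)  # file_path_sans_ext() and file_ext(); ships with R

# Hypothetical helper: build "<out_dir>/<stem>[<part>].<ext>" from an input path
build_dest <- function(f_in, part, out_dir = dirname(f_in)) {
  stem <- file_path_sans_ext(basename(f_in))  # "M1_1" for "Test/M1_1.dat"
  ext  <- file_ext(f_in)                      # "dat"
  file.path(out_dir, sprintf("%s[%s].%s", stem, part, ext))
}

build_dest("Test/M1_1.dat", 1)
build_dest("Test/M1_1.dat", 2, out_dir = "Test/M1")
```

Passing the full path straight to paste0(), as in the question, keeps the original ".dat" in the middle of the name, which is why stripping the extension comes first here.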
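Now we can loop through the files:
Note: for my purposes I am not going to spend the time looping through every file. For your purposes this would be applied to each of your initial files, or in my case all_fi rather than all_fi[2:5].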
> rbind.pages(lapply(all_fi[2:5], f.it))
============================ =========
file_name file_size
============================ =========
inst/data/yob1881/female.txt 16476
inst/data/yob1881/male.txt 15306
inst/data/yob1882/female.txt 18109
inst/data/yob1882/male.txt 16923
inst/data/yob1883/female.txt 18537
inst/data/yob1883/male.txt 15861
inst/data/yob1884/female.txt 20641
inst/data/yob1884/male.txt 17300
============================ =========
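Applied to the layout from the question, the same idea might look like the base R sketch below. It is only an illustration under stated assumptions: the demo data is made up and written to a temporary folder, the function name split_into_subfolders is hypothetical, and the mapping from file to subfolder (M1_1 goes to M1, M1_2 goes to M2) is inferred from the paths given in the question.

```r
library(tools)  # file_path_sans_ext()

# Made-up demo data: a temporary "Test" folder with two stacked files
root <- file.path(tempdir(), "Test")
dir.create(root, showWarnings = FALSE)
for (nm in c("M1_1", "M1_2")) {
  demo <- data.frame(.imp = rep(1:3, each = 2), .id = 1:6,
                     x = seq(0.5, 3, by = 0.5))
  write.table(demo, file.path(root, paste0(nm, ".dat")),
              sep = "\t", row.names = FALSE)
}

split_into_subfolders <- function(f_in) {
  stem <- file_path_sans_ext(basename(f_in))  # e.g. "M1_1"
  # Assumed mapping from the question: the number after "_" picks the folder
  folder <- file.path(dirname(f_in), paste0("M", sub(".*_", "", stem)))
  dir.create(folder, showWarnings = FALSE)

  data <- read.table(f_in, header = TRUE)
  df   <- data[-2]                 # drop the second column, as in the question
  out  <- split(df, df$.imp)

  # Write each piece and return the paths that were written
  vapply(names(out), function(z) {
    dest <- file.path(folder, sprintf("%s[%s].dat", stem, z))
    write.table(out[[z]], dest, sep = "\t",
                row.names = FALSE, col.names = FALSE)
    dest
  }, character(1))
}

# Non-recursive listing, so files already written to subfolders are skipped
inputs  <- list.files(root, pattern = "\\.dat$", full.names = TRUE)
written <- unlist(lapply(inputs, split_into_subfolders), use.names = FALSE)
basename(written)
```

Creating the subfolder inside the function means it does not matter whether the folders were prepared beforehand, and returning the written paths gives a quick way to check the result.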