Stacking results into one master file in R


Question

Using this script I have created a separate folder for each csv file and saved all my further analysis results in that folder. Each folder has the same name as its csv file. The csv files themselves are stored in the main/master directory. I have now also created a csv file in each of these folders which contains a list of all the fitted values.

I would now like to do the following:


  1. Set the working directory to the folder named after the file
  2. Read the fitted values file
  3. Add a row/column stating the name of the site/unique ID
  4. Add it to the master file, which is stored in the main directory, with a title specifying the site name/filename. It can be stacked by rows or by columns; it doesn't really matter.
  5. Return to the main directory to pick the next file
  6. Repeat the loop

Using merge(), rbind(), or cbind() combines all the data under one column name. I want to keep all the sites separate so that I can compare them at a later stage.

This is what I'm using at the moment and I'm lost on how to proceed further.

setwd("path")    # main directory
path <- "path"   # kept for convenience when switching back to the main directory

# list all csv files in the main directory as a character vector
files <- list.files(path = path, pattern = "*.csv")

for (i in seq(1, length(files), by = 1)) {
  fileName <- read.csv(files[i])             # read the site's csv file
  base <- strsplit(files[i], ".csv")[[1]]    # file name without the extension
  setwd(file.path(path, base))               # move into the folder with the same name
  master <- read.csv(paste(base, "_fiited_values curve.csv"))
  # read the fitted value csv file for the site and store it in a list
}



I want to construct a for loop to make one master file with the files in different directories. I do not want to merge all under one column name.

For example, if I have 50 similar csv files and each has two columns of data, I would like to have one csv file which accommodates all of them, but in their original format rather than appended to the existing rows/columns. I would then have 100 columns of data.

Please tell me what further information I can provide.

Recommended answer

For reading a group of files from a number of different directories, with pathnames patha, pathb, pathc:

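paths = c('patha','pathb','pathc')
# pattern is a regular expression (not a glob), so match the extension explicitly
files = unlist(lapply(paths, function(path) list.files(path, pattern = "\\.csv$", full.names = TRUE)))

# one data frame per file, kept as separate list elements
listContainingAllFiles = lapply(files, read.csv)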
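To try the pattern without touching real data, here is a minimal self-contained sketch. It assumes a throwaway layout under tempdir(); the directory names, file names (siteA.csv, siteB.csv) and values are invented purely for illustration. It also names each list element after its file, which is one possible way to keep the pieces identifiable.

```r
# Hypothetical layout: two temporary directories, one CSV in each.
root  <- file.path(tempdir(), "stack_demo")
paths <- file.path(root, c("patha", "pathb"))
for (p in paths) dir.create(p, recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(x = 1:3, fitted = c(1.1, 2.0, 2.9)),
          file.path(paths[1], "siteA.csv"), row.names = FALSE)
write.csv(data.frame(x = 1:3, fitted = c(0.9, 2.2, 3.1)),
          file.path(paths[2], "siteB.csv"), row.names = FALSE)

# Collect the full path of every CSV file in every directory.
files <- unlist(lapply(paths, list.files,
                       pattern = "\\.csv$", full.names = TRUE))

# Read each file into its own data frame and label it by file name.
listContainingAllFiles <- lapply(files, read.csv)
names(listContainingAllFiles) <- basename(files)

# Show the top-level structure: one data frame per file.
str(listContainingAllFiles, max.level = 1)
```

Because full.names = TRUE returns complete paths, nothing here depends on the current working directory, so no setwd() calls are needed.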

If you want to be really quick about it, you can grab fread from data.table:

library(data.table)
listContainingAllFiles = lapply(files, fread)

Either way this will give you a list of all objects, kept separate. If you want to join them together vertically/horizontally, then:

do.call(rbind, listContainingAllFiles)
do.call(cbind, listContainingAllFiles)
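To see what the two calls produce, here is a small sketch with two made-up tables standing in for two sites (the names siteA and siteB and their values are assumptions for illustration only):

```r
# Two small made-up tables standing in for two sites' fitted values.
siteA <- data.frame(x = 1:3, fitted = c(1.1, 2.0, 2.9))
siteB <- data.frame(x = 1:3, fitted = c(0.9, 2.2, 3.1))
listContainingAllFiles <- list(siteA, siteB)

stacked <- do.call(rbind, listContainingAllFiles)  # rows appended
wide    <- do.call(cbind, listContainingAllFiles)  # columns side by side

dim(stacked)  # 6 rows, 2 columns
dim(wide)     # 3 rows, 4 columns
names(wide)   # "x" "fitted" "x" "fitted": column names are duplicated
```

Neither result records which site a value came from: the stacked version loses it in the rows, and the wide version ends up with duplicated column names.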

Edit: note that the cbind() version makes no sense unless row n in one file actually corresponds to row n in every other file. It makes far more sense to just create a field tracking which location the data is from.
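If the rows really do correspond and a wide layout is wanted (two columns per site, as in the question), one possible sketch is to prefix every column with its site name before binding and to refuse to bind when the row counts differ. The site names and values below are made up, and the "site_column" naming scheme is just one assumed convention:

```r
# Made-up per-site tables, named by site.
tables <- list(
  siteA = data.frame(x = 1:3, fitted = c(1.1, 2.0, 2.9)),
  siteB = data.frame(x = 1:3, fitted = c(0.9, 2.2, 3.1))
)

# Prefix each column with its site name so the origin stays visible.
tagged <- Map(function(df, site) {
  names(df) <- paste(site, names(df), sep = "_")
  df
}, tables, names(tables))

# Side-by-side binding only makes sense when row counts match.
stopifnot(length(unique(vapply(tagged, nrow, integer(1)))) == 1)

# unname() stops cbind() from adding a second prefix from the list names.
wide <- do.call(cbind, unname(tagged))
names(wide)
```

With 50 two-column files this would give the 100 separately named columns described in the question, but only under the assumption that every file has the same number of rows in the same order.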

If you want to use the file names as the way of identifying the sample location (I don't see where you're getting this info from in your example), then you want to add them as you read in the files, so:

listContainingAllFiles = lapply(files, 
                            function(file) data.frame(filename = file,
                                                      read.csv(file)))

Later you can split the filename column to recover your details (assuming, of course, that you have a standard naming convention).
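Putting the pieces together for a layout like the one in the question (one folder per site, each holding a fitted-values file), a rough end-to-end sketch might look like this. The folder names, the "_fitted_values.csv" suffix and the output name masterfile.csv are all assumptions made for the demo, not the asker's real names:

```r
# Hypothetical layout: one folder per site holding a fitted-values CSV.
main <- file.path(tempdir(), "master_demo")
for (site in c("siteA", "siteB")) {
  dir.create(file.path(main, site), recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(x = 1:3, fitted = c(1, 2, 3)),
            file.path(main, site, paste0(site, "_fitted_values.csv")),
            row.names = FALSE)
}

# Find the fitted-values files in every site folder (no setwd() needed).
files <- list.files(main, pattern = "_fitted_values\\.csv$",
                    recursive = TRUE, full.names = TRUE)

# Read each file and record which file it came from.
listContainingAllFiles <- lapply(files, function(file)
  data.frame(filename = file, read.csv(file)))
master <- do.call(rbind, listContainingAllFiles)

# Recover the site ID from the path: here it is the parent folder name.
master$site <- basename(dirname(as.character(master$filename)))

# Save the stacked result in the main directory.
write.csv(master, file.path(main, "masterfile.csv"), row.names = FALSE)
table(master$site)
```

Keeping a site column in a long table means the sites stay separate for later comparison, for example via split(master, master$site), even though everything lives in one file.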
