How to read multiple xlsx files in R using a loop with specific rows and columns


Problem description


I have to read multiple xlsx files with random names into a single data frame. The structure of each file is the same. I only need to import specific columns.

I tried this:

dat <- read.xlsx("FILE.xlsx", sheetIndex=1, 
                  sheetName=NULL, startRow=5, 
                  endRow=NULL, as.data.frame=TRUE, 
                  header=TRUE)

But this reads only one file at a time, and I couldn't specify the particular columns I need. I also tried:

site=list.files(pattern='[.]xls')

but the loop I wrote after that isn't working. How can I do it? Thanks in advance.

Solution

I would read each sheet to a list:

Get file names:

f = list.files("./", pattern = "[.]xlsx$")
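
Listing a directory without a filter returns every file in it, so it can help to restrict the result to workbooks before reading anything. The following is a minimal sketch, assuming the workbooks end in `.xlsx`; the temporary directory and file names are made up for illustration and are not from the question.

```r
# Hypothetical demo directory holding a mix of files (names are illustrative)
demo_dir <- file.path(tempdir(), "xlsx_listing_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(
  demo_dir, c("a1b2.xlsx", "zz9.xlsx", "notes.txt", "~$zz9.xlsx")
)))

# Keep only names that end in .xlsx
f <- list.files(demo_dir, pattern = "[.]xlsx$", full.names = TRUE)

# Drop Excel lock files, whose names start with "~$"
f <- f[!startsWith(basename(f), "~$")]

basename(f)
```

Using `full.names = TRUE` keeps the directory in each path, so the files can be read no matter what the working directory is.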

Read files:

library(xlsx)  # provides read.xlsx

dat = lapply(f, function(i){
    # Read the first sheet of file i, starting at row 5
    x = read.xlsx(i, sheetIndex=1, sheetName=NULL, startRow=5,
        endRow=NULL, as.data.frame=TRUE, header=TRUE)
    # Get the columns you want, e.g. 1, 3, 5
    x = x[, c(1, 3, 5)]
    # You may want to add a column to say which file they're from
    x$file = i
    # Return your data
    x
})
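
The same pattern can be wrapped in a small helper. This is a sketch under assumptions, not the answerer's code: `read_many`, `reader` and `cols` are hypothetical names, and CSV files stand in for the workbooks so the example runs with base R alone.

```r
# Read several files with the same layout, keep the chosen columns, and tag
# each row with its source file. `reader` is a placeholder for any function
# that reads one file into a data frame.
read_many <- function(files, reader, cols) {
  pieces <- lapply(files, function(path) {
    x <- reader(path)
    x <- x[, cols, drop = FALSE]  # keep only the requested columns
    x$file <- basename(path)      # remember where each row came from
    x
  })
  do.call(rbind, pieces)          # stack the pieces into one data frame
}

# Stand-in data: two small CSV files with identical structure
tmp <- file.path(tempdir(), c("one.csv", "two.csv"))
write.csv(data.frame(a = 1:2, b = 3:4, c = 5:6), tmp[1], row.names = FALSE)
write.csv(data.frame(a = 7:8, b = 9:10, c = 11:12), tmp[2], row.names = FALSE)

combined <- read_many(tmp, read.csv, cols = c(1, 3))
combined
```

For real workbooks, `reader` could be something like `function(p) read.xlsx(p, sheetIndex = 1, startRow = 5)`. The `read.xlsx` function in the xlsx package also takes a `colIndex` argument, which may let you select the columns at read time instead of subsetting afterwards.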

You can then access the items in your list with:

dat[[1]]

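Or apply the same function to each of them, for example column means over the numeric columns (the character file column has to be left out):

lapply(dat, function(d) colMeans(d[sapply(d, is.numeric)]))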
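
As a worked illustration with made-up data (the data frames below are not from the question), column means can be computed per file once non-numeric columns such as `file` are set aside. Note that base R spells the function `colMeans`.

```r
# Two small data frames standing in for the list produced by lapply()
dat_demo <- list(
  data.frame(x = c(1, 3), y = c(10, 20), file = "a.xlsx"),
  data.frame(x = c(5, 7), y = c(30, 50), file = "b.xlsx")
)

# colMeans() needs numeric input, so select the numeric columns first
means <- lapply(dat_demo, function(d) colMeans(d[sapply(d, is.numeric)]))

means[[1]]  # x = 2, y = 15
means[[2]]  # x = 6, y = 40
```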

Turn them into a data frame (where your file column now becomes useful):

dat = do.call("rbind.data.frame", dat)
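
To illustrate why the file column is useful after binding, here is a short sketch with made-up data. The names `pieces`, `all_rows` and `value` are hypothetical. Once everything is in one data frame, rows can be counted or summarised per source file.

```r
# Stand-in for the list of per-file data frames
pieces <- list(
  data.frame(value = c(1, 2), file = "a.xlsx"),
  data.frame(value = c(10, 20, 30), file = "b.xlsx")
)

# Stack them into a single data frame
all_rows <- do.call(rbind, pieces)

# Use the file column to summarise by source file
rows_per_file <- table(all_rows$file)
mean_per_file <- tapply(all_rows$value, all_rows$file, mean)

rows_per_file
mean_per_file
```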
