Uploading multiple files in Shiny, processing the files using lapply, rbinding the results and returning a download
Problem description
In my previous post, I was able to upload multiple files in Shiny, process the files, rbind the results and return a CSV file download using a for loop. Thanks to @SBista for the contribution. However, because I have to upload many files at a time (about 50-100 MB in total), the Shiny app runs very slowly, perhaps because of the for loop. I understand that lapply() is faster than a for loop for reading multiple CSV files, but using lapply() in my code gives an error (ERROR: Invalid 'description' argument) after running the app. Any help will be appreciated. This is my dummy file, and this is my code:
library(shiny)
ui <- fluidPage(
  fluidPage(
    titlePanel("MY CSV FILES MERGER"),
    sidebarLayout(
      sidebarPanel(
        fileInput("file1",
                  "Choose CSV files from a directory",
                  multiple = TRUE,
                  accept = c('text/csv',
                             'text/comma-separated-values,text/plain',
                             '.csv')),
        downloadButton('downloadData', 'Download')
      ),
      mainPanel(
        tableOutput('contents')
      )
    )
  )
)
library(shiny)
library(dplyr)
options(shiny.maxRequestSize = 100*1024^2)
server <- function(input, output) {
  getData <- reactive({
    inFile <- input$file1
    if (is.null(inFile)){
      return(NULL)
    }else {
      files3 = lapply(inFile, function(y){
        JSON_csv = read.csv(y, header = TRUE)
        lastrow = nrow(JSON_csv)
        shift = function(x, n){
          c(x[-(seq(n))], rep(NA, n))
        }
        JSON_csv$companyID1 = shift(JSON_csv$companyID1, 1)
        JSON_csv = JSON_csv[-lastrow, ]
        JSON_csv
      }
      )
      do.call(rbind, files3)
    }
  })
  output$contents <- renderTable(
    getData()
  )
  output$downloadData <- downloadHandler(
    filename = function() {
      paste("data-", Sys.time(), ".csv", sep="")
    },
    content = function(file) {
      write.csv(getData(), file, row.names=FALSE)
    })
}
shinyApp(ui = ui, server = server)
With a for loop, this code works, but it is very slow when working with multiple CSV files of 50-100 MB:
library(shiny)
library(dplyr)
server <- function(input, output) {
  getData <- reactive({
    inFile <- input$file1
    if (is.null(inFile)){
      return(NULL)
    }else {
      # browser()
      numfiles = nrow(inFile)
      kata_csv1 = list()
      for (i in 1:numfiles)
      {
        JSON_csv = read.csv(input$file1[[i, 'datapath']], header = TRUE)
        lastrow = nrow(JSON_csv)
        shift = function(x, n){
          c(x[-(seq(n))], rep(NA, n))
        }
        JSON_csv$companyID1 = shift(JSON_csv$companyID1, 1)
        kata_csv1[[i]] = JSON_csv[-lastrow, ]
      }
      # browser()
      do.call(rbind, kata_csv1)
    }
  })
  output$contents <- renderTable(
    getData()
  )
  output$downloadData <- downloadHandler(
    filename = function() {
      paste("data-", Sys.Date(), ".csv", sep="")
    },
    content = function(file) {
      write.csv(getData(), file, row.names=FALSE)
    })
}
shinyApp(ui = ui, server = server)
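For reference, the per-file step used in both versions of the server can be tried outside Shiny. This is only a minimal sketch: the sample data frame and its `value` column are made up for illustration, and only `shift` and the column name `companyID1` come from the code above.

```r
# shift() as defined in the question: drop the first n elements, pad with NA
shift <- function(x, n) {
  c(x[-(seq(n))], rep(NA, n))
}

# Hypothetical sample data; only the column name companyID1 is from the question
df <- data.frame(companyID1 = c("A", "B", "C"),
                 value = 1:3,
                 stringsAsFactors = FALSE)

df$companyID1 <- shift(df$companyID1, 1)  # move each ID up by one row
df <- df[-nrow(df), ]                     # drop the last row, whose ID is now NA
print(df)
```

After this step each remaining row pairs its original `value` with the `companyID1` of the row that followed it, and the data frame has one row fewer than the input.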
Recommended answer
The problem is that inFile is a data frame, so when you pass it to lapply the function receives whole columns, starting with the first one, which contains the file names, rather than one file path at a time. Instead, you need to pass inFile$datapath. The lapply call should look like this:
files3 = lapply(inFile$datapath, function(y){
  JSON_csv = read.csv(y, header = TRUE)
  lastrow = nrow(JSON_csv)
  shift = function(x, n){
    c(x[-(seq(n))], rep(NA, n))
  }
  JSON_csv$companyID1 = shift(JSON_csv$companyID1, 1)
  JSON_csv = JSON_csv[-lastrow, ]
  JSON_csv
}
)
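To see the difference outside Shiny, it may help to imitate the shape of input$file1 with an ordinary data frame. The sketch below is an illustration under assumptions: the temporary files, their contents and the stand-in `inFile` are made up, and only the `name` and `datapath` column names mirror what fileInput provides.

```r
# Two small temporary CSV files (made-up data for illustration)
paths <- c(tempfile(fileext = ".csv"), tempfile(fileext = ".csv"))
write.csv(data.frame(companyID1 = c("A", "B")), paths[1], row.names = FALSE)
write.csv(data.frame(companyID1 = c("C", "D")), paths[2], row.names = FALSE)

# Stand-in for input$file1: one row per uploaded file
inFile <- data.frame(name = c("a.csv", "b.csv"),
                     datapath = paths,
                     stringsAsFactors = FALSE)

# lapply() over a data frame visits its columns, not its rows
visited <- names(lapply(inFile, length))
print(visited)

# So read.csv() is handed the whole 'name' column at once and fails
err <- tryCatch(lapply(inFile, read.csv, header = TRUE),
                error = function(e) conditionMessage(e))
print(err)

# Iterating over datapath gives read.csv() one file path per call
tables <- lapply(inFile$datapath, read.csv, header = TRUE)
combined <- do.call(rbind, tables)
print(combined)
```

Here `visited` lists the column names that lapply walked over, while `combined` holds the rows of both files stacked in upload order.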
Hope it helps!