Uploading multiple files in Shiny, processing the files using lapply, rbinding the results and returning a download


Question

In my previous post, I was able to upload multiple files in Shiny, process the files, rbind the results and return a CSV file download, using a for loop. Thanks to @SBista for the contribution. However, because I have to upload a lot of files at a time (total size of about 50-100 MB), I found the Shiny app to be very slow, perhaps due to the for loop. I understand that lapply() is faster than a for loop for reading multiple CSV files, but using lapply() in my code gives an error (ERROR: Invalid 'description' argument) after running the app. Any help will be appreciated. This is my dummy file, and this is my code:

 library(shiny)

 ui <- fluidPage(
   fluidPage(
     titlePanel("MY CSV FILES MERGER"),
     sidebarLayout(
       sidebarPanel(
         fileInput("file1",
              "Choose CSV files from a directory",
              multiple = TRUE,
              accept=c('text/csv', 
                       'text/comma-separated-values,text/plain', 
                       '.csv')),
         downloadButton('downloadData', 'Download')
       ),
       mainPanel(
         tableOutput('contents')
       )
     )
   )
 )

 library(shiny)
 library(dplyr)
 options(shiny.maxRequestSize = 100*1024^2)
 server <-  function(input, output) {
   getData <- reactive({
     inFile <- input$file1
     if (is.null(inFile)){
       return(NULL)
     }else {   
      files3 = lapply(inFile, function(y){
        JSON_csv = read.csv(y, header = TRUE)
        lastrow = nrow(JSON_csv)
        shift = function(x, n){
          c(x[-(seq(n))], rep(NA, n))
        }
        JSON_csv$companyID1 = shift(JSON_csv$companyID1, 1)
        JSON_csv = JSON_csv[-lastrow, ]
        JSON_csv 
      })
       do.call(rbind, files3)
     }
   })
   output$contents <- renderTable( 
     getData() 
   )
   output$downloadData <- downloadHandler(
     filename = function() { 
       paste("data-", Sys.time(), ".csv", sep="")
     },
     content = function(file) { 
       write.csv(getData(), file, row.names=FALSE)   
     })
 }

 shinyApp(ui = ui, server = server)

With a for loop, this code works, but it is very slow when working with multiple CSV files of 50-100 MB:

 library(shiny)
 library(dplyr)
 server <-  function(input, output) {
 getData <- reactive({
  inFile <- input$file1
  if (is.null(inFile)){
    return(NULL)
  }else {
    # browser()
    numfiles = nrow(inFile) 
    kata_csv1 = list()


    for (i in 1:numfiles)
    {

      JSON_csv = read.csv(input$file1[[i, 'datapath']], header = TRUE)
      lastrow = nrow(JSON_csv)
      shift = function(x, n){
        c(x[-(seq(n))], rep(NA, n))
      }
      JSON_csv$companyID1 = shift(JSON_csv$companyID1, 1)
      kata_csv1[[i]] = JSON_csv[-lastrow, ]

    }
    # browser()
    do.call(rbind, kata_csv1)
     }
   })
  output$contents <- renderTable( 
  getData() 
  )
  output$downloadData <- downloadHandler(
  filename = function() { 
    paste("data-", Sys.Date(), ".csv", sep="")
  },
  content = function(file) { 
    write.csv(getData(), file, row.names=FALSE)   
  })
  }

 shinyApp(ui = ui, server = server)

Recommended answer

The problem is that when you pass inFile to lapply, it iterates over the columns of the data frame, so read.csv first receives the name column (the original file names) instead of a file path. Instead, you'll need to pass inFile$datapath. The lapply should look like this:

   files3 = lapply(inFile$datapath, function(y){

     JSON_csv = read.csv(y, header = TRUE)
     lastrow = nrow(JSON_csv)
     shift = function(x, n){
       c(x[-(seq(n))], rep(NA, n))
     }
     JSON_csv$companyID1 = shift(JSON_csv$companyID1, 1)
     JSON_csv = JSON_csv[-lastrow, ]
     JSON_csv 
   })
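
To see the difference outside of Shiny, here is a minimal sketch that fakes the data frame fileInput returns when multiple = TRUE (columns name, size, type, datapath). The temporary files, their contents and the variable names columns_seen, bad, good and merged are made up for illustration only; they are not part of the original app.

```r
# Two small CSV files standing in for uploaded files (illustrative data).
tmp1 <- tempfile(fileext = ".csv")
tmp2 <- tempfile(fileext = ".csv")
write.csv(data.frame(companyID1 = 1:3, value = c("a", "b", "c")),
          tmp1, row.names = FALSE)
write.csv(data.frame(companyID1 = 4:6, value = c("d", "e", "f")),
          tmp2, row.names = FALSE)

# Mock of input$file1: one row per file, same columns Shiny provides.
inFile <- data.frame(
  name = c("first.csv", "second.csv"),
  size = file.size(c(tmp1, tmp2)),
  type = "text/csv",
  datapath = c(tmp1, tmp2),
  stringsAsFactors = FALSE
)

# lapply over a data frame walks its columns, not its rows.
columns_seen <- names(lapply(inFile, identity))

# Handing a whole column of file names to read.csv raises an error,
# which is captured here as a message instead of stopping the script.
bad <- tryCatch(read.csv(inFile$name, header = TRUE),
                error = function(e) conditionMessage(e))

# lapply over the datapath column calls read.csv once per file.
good <- lapply(inFile$datapath, read.csv, header = TRUE)
merged <- do.call(rbind, good)
```

Printing columns_seen shows that the function passed to lapply(inFile, ...) would be called with whole columns, starting with name, while merged contains the rows of both files stacked together.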

Hope it helps!
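
As a side note, the per-file step used in both versions (shift companyID1 up by one position, then drop the last row) can be tried on its own. This is only a sketch: process_one is a hypothetical helper name, it takes a data frame rather than a file path so that it runs without Shiny, and the example data is invented.

```r
# Move every value up by n positions and pad the end with NA
# (same logic as the shift() defined in the question's code).
shift <- function(x, n) {
  c(x[-seq_len(n)], rep(NA, n))
}

# Hypothetical helper wrapping the per-file processing from the question.
process_one <- function(df) {
  df$companyID1 <- shift(df$companyID1, 1)
  # After the shift the last row holds NA in companyID1, so it is dropped.
  df[-nrow(df), , drop = FALSE]
}

# Invented three-row example.
example <- data.frame(companyID1 = c(10, 20, 30),
                      other = c("a", "b", "c"),
                      stringsAsFactors = FALSE)
result <- process_one(example)
```

Inside the app, the same helper could be applied as lapply(inFile$datapath, function(p) process_one(read.csv(p, header = TRUE))) before the do.call(rbind, ...) step.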
