Best Practices to Create and Download a Huge ZIP (from Several BLOBs) in a WebApp


Problem description




I will need to perform a massive download of files from my Web Application.

It is obviously expected to be a long-running action (it will be used once per year, per customer), so time is not a problem (unless it hits some timeout, but I can handle that by creating some form of keepalive heartbeat). I know how to create a hidden iframe and use it with content-disposition: attachment to make the browser download the file instead of opening it, and how to set up client-server communication for drawing a progress meter.

The actual size of the download (and the number of files) is unknown, but for simplicity we can treat it as 1 GB, composed of 100 files of 10 MB each.

Since this should be a one-click operation, my first thought was to group all the files, while reading them from the database, into a dynamically generated ZIP, then ask the user to save the ZIP.

The question is: what are the best practices, and what are the known drawbacks and traps, in creating a huge archive from multiple small byte arrays in a WebApp?

That can be loosely split into:

• Should each byte array be converted into a physical temp file, or can they be added to the ZIP in memory?
• If temp files are needed, I know I'll have to handle name collisions (records in the database can share the same name, but files inside the same file system or ZIP cannot). Are there any other possible problems that come to mind (assuming the file system always has enough physical space)?
• Since I can't rely on having enough RAM to perform the whole operation in memory, I guess the ZIP should be created and written to the file system before being sent to the user. Is there any way to do it differently (e.g. with websocket), like asking the user where to save the file and then starting a constant flow of data from the server to the client (Sci-Fi I guess)?
• Any other related known problems or best practices that cross your mind would be greatly appreciated.

Solution

For large content that won't fit in memory at once, stream the content from the database to the response.

This kind of thing is actually pretty simple. You don't need AJAX or websockets; it's possible to stream large file downloads through a simple link that the user clicks on. And modern browsers have decent download managers with their own progress bars, so why reinvent the wheel?

If you are writing a servlet from scratch for this, get access to the database BLOB, obtain its input stream, and copy the content through to the HTTP response output stream. If you have the Apache Commons IO library, you can use IOUtils.copy(); otherwise you can do this yourself.
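For the "do this yourself" case, a minimal sketch of the copy loop could look like the following. The class name `StreamCopy` and the 8 KB buffer size are assumptions made for illustration; the method works on plain `InputStream`/`OutputStream` so that it does not depend on any particular servlet API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: copies a stream in fixed-size chunks so memory use stays constant. */
public final class StreamCopy {
    private StreamCopy() {}

    /** Copies everything from in to out and returns the number of bytes copied. */
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // assumed buffer size; only this much is held in memory
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

In a servlet, `in` would typically come from `java.sql.Blob.getBinaryStream()` and `out` from the response's output stream. On Java 9 and later, `InputStream.transferTo(OutputStream)` does the same job.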

Creating a ZIP file on the fly can be done with a ZipOutputStream. Create one over the response output stream (from the servlet or whatever your framework gives you), then, for each BLOB from the database, call putNextEntry() first and stream the BLOB's content as described above.
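The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: `ZipStreamer`, `BlobSource`, `inMemory` and `uniqueName` are hypothetical names, and the "name (1).ext" renaming scheme is just one possible way to deal with records that share a name (`ZipOutputStream.putNextEntry()` throws a `ZipException` on a duplicate entry name).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashSet;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public final class ZipStreamer {
    private ZipStreamer() {}

    /** Hypothetical view of one database record: its file name and a way to open its BLOB. */
    public interface BlobSource {
        String name();
        InputStream open() throws IOException;
    }

    /** Demo stand-in for a real BLOB; in a servlet, open() would wrap Blob.getBinaryStream(). */
    public static BlobSource inMemory(String name, byte[] content) {
        return new BlobSource() {
            public String name() { return name; }
            public InputStream open() { return new ByteArrayInputStream(content); }
        };
    }

    /** Streams every BLOB into a ZIP written directly to target, one entry at a time. */
    public static void writeZip(Iterable<BlobSource> blobs, OutputStream target) throws IOException {
        Set<String> used = new HashSet<>();
        byte[] buffer = new byte[8192];
        ZipOutputStream zip = new ZipOutputStream(target);
        for (BlobSource blob : blobs) {
            zip.putNextEntry(new ZipEntry(uniqueName(blob.name(), used)));
            try (InputStream in = blob.open()) {
                int read;
                while ((read = in.read(buffer)) != -1) {
                    zip.write(buffer, 0, read);
                }
            }
            zip.closeEntry();
        }
        zip.finish(); // writes the central directory but leaves target open for the caller
    }

    /** Returns name, or "name (n).ext" if name was already used (assumed renaming scheme). */
    public static String uniqueName(String name, Set<String> used) {
        String candidate = name;
        int dot = name.lastIndexOf('.');
        for (int i = 1; !used.add(candidate); i++) {
            String suffix = " (" + i + ")";
            candidate = dot > 0
                    ? name.substring(0, dot) + suffix + name.substring(dot)
                    : name + suffix;
        }
        return candidate;
    }
}
```

Only one buffer's worth of data is held in memory at a time, so neither temp files nor an in-memory archive are needed. In a servlet, `target` would be the response output stream, with the content type and the content-disposition: attachment header set before the first write.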

Potential Pitfalls/Issues:

• Depending on the download size and network speed, the request might take a long time to complete. Firewalls, etc. can get in the way of this and terminate the request early.
• Hopefully your users are on a decent corporate network when requesting these files. It would be far worse over remote/dodgy/mobile connections (if the connection drops after downloading 1.9 GB of 2.0 GB, users have to start again).
• It can put a bit of load on your server, especially when compressing huge ZIP files. It might be worth turning compression down or off when creating the ZipOutputStream if this is a problem.
• ZIP files over 2 GB (or is that 4 GB?) might have issues with some ZIP programs. I think the latest Java 7 uses ZIP64 extensions, so this version of Java will write the huge ZIP correctly, but will the clients have programs that support large ZIP files? I've definitely run into issues with these before, especially on old Solaris servers.
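As a sketch of the "turn compression down/off" suggestion, `ZipOutputStream.setLevel()` accepts the `Deflater` level constants. The class name `ZipCompression` and the `compress` flag are assumptions for illustration.

```java
import java.io.OutputStream;
import java.util.zip.Deflater;
import java.util.zip.ZipOutputStream;

/** Hypothetical factory: opens a ZIP stream with cheap or no compression to save server CPU. */
public final class ZipCompression {
    private ZipCompression() {}

    public static ZipOutputStream open(OutputStream target, boolean compress) {
        ZipOutputStream zip = new ZipOutputStream(target);
        // BEST_SPEED trades compression ratio for CPU; NO_COMPRESSION skips compression,
        // so the archive ends up slightly larger than the raw data.
        zip.setLevel(compress ? Deflater.BEST_SPEED : Deflater.NO_COMPRESSION);
        return zip;
    }
}
```

Setting the level is used here instead of `setMethod(ZipOutputStream.STORED)` because STORED entries need their size and CRC-32 set before `putNextEntry()`, which is awkward when the BLOB is being streamed.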
