Generating ZIP files with PHP + Apache on-the-fly in high speed?


Question


To quote some famous words:

"Programmers… often take refuge in an understandable, but disastrous, inclination towards complexity and ingenuity in their work. Forbidden to design anything larger than a program, they respond by making that program intricate enough to challenge their professional skill."

While solving some mundane problem at work I came up with this idea, which I'm not quite sure how to solve. I know I won't be implementing this, but I'm very curious as to what the best solution is. :)


Suppose you have this big collection of JPG files and a few odd SWF files. By "big" I mean "a couple thousand". Every JPG file is around 200 KB, and the SWFs can be up to a few MB in size. Every day there are a few new JPG files. The total size of everything is thus around 1 GB, and it is slowly but steadily increasing. Files are VERY rarely changed or deleted.

The users can view each of the files individually on the webpage. However, there is also a desire to allow them to download a whole bunch of them at once. The files have some metadata attached (date, category, etc.) that the user can filter the collection by.

The ultimate implementation would then be to allow the user to specify some filter criteria and then download the corresponding files as a single ZIP file.

Since the number of possible criteria combinations is large, I cannot pre-generate all the possible ZIP files and must build them on-the-fly. Another problem is that the download can be quite large, and for users with slow connections it's quite likely to take an hour or more. Support for "resume" is therefore a must-have.
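Resume support boils down to honoring HTTP `Range` requests. A minimal sketch, assuming a single `bytes=START-` range and a hypothetical local file path, might look like:

```php
<?php
// Minimal "resume" sketch: honor a single-range
// "Range: bytes=START-" request header while streaming a file.
// $path is a hypothetical local file to serve.
$path  = '/data/archive/example.zip';
$size  = filesize($path);
$start = 0;

if (isset($_SERVER['HTTP_RANGE']) &&
    preg_match('/bytes=(\d+)-/', $_SERVER['HTTP_RANGE'], $m)) {
    $start = (int) $m[1];
    header('HTTP/1.1 206 Partial Content');
    header("Content-Range: bytes $start-" . ($size - 1) . "/$size");
} else {
    header('HTTP/1.1 200 OK');
}
header('Accept-Ranges: bytes');
header('Content-Length: ' . ($size - $start));
header('Content-Type: application/zip');

$fp = fopen($path, 'rb');
fseek($fp, $start);
// Stream in 8 KB chunks so memory use stays flat regardless of file size.
while (!feof($fp)) {
    echo fread($fp, 8192);
    flush();
}
fclose($fp);
```

Note that for a ZIP generated on-the-fly (rather than an existing file), resuming means regenerating the stream and skipping the first `$start` bytes, which only works if the file list and therefore the byte layout are identical between requests.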

On the bright side, however, the ZIP doesn't need to compress anything - the files are mostly JPEGs anyway. Thus the whole process shouldn't be any more CPU-intensive than a simple file download.

The problems I have identified so far are:

  • PHP has an execution timeout for scripts. While the script itself can change it, will removing it completely cause any problems?
  • With the resume option, there is the possibility of the filter results changing between HTTP requests. This can be mitigated by sorting the results chronologically, since the collection only ever grows. The request URL would then also include the date when it was originally created, and the script would not consider files newer than that. Will this be enough?
  • Won't passing large amounts of file data through PHP be a performance hit in itself?
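The first two bullets can be sketched together: lift the timeout for this one request, and pin the file list to the timestamp carried in the original download URL so that resumed requests see an identical, deterministically ordered list. This is only a sketch; the table and column names are hypothetical:

```php
<?php
// Sketch of the "snapshot cutoff" idea from the second bullet:
// the download URL carries the timestamp at which the file list
// was first built, and later (resumed) requests ignore anything
// newer, so the byte layout of the ZIP stays stable.
set_time_limit(0); // lift the script timeout for this long download

$cutoff = (int) ($_GET['created'] ?? time());

$pdo  = new PDO('sqlite:files.db'); // hypothetical metadata store
$stmt = $pdo->prepare(
    'SELECT path FROM files
     WHERE category = :cat AND added_at <= :cutoff
     ORDER BY added_at, path' // deterministic order across requests
);
$stmt->execute([
    ':cat'    => $_GET['category'] ?? '',
    ':cutoff' => $cutoff,
]);
$paths = $stmt->fetchAll(PDO::FETCH_COLUMN);
```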

How would you implement this? Is PHP up to the task at all?


Added:

By now two people have suggested storing the requested ZIP files in a temporary folder and serving them from there as regular files. While this is indeed an obvious solution, there are several practical considerations which make it infeasible.

The ZIP files will usually be pretty large, ranging from a few tens of megabytes to hundreds of megabytes. It's also completely normal for a user to request "everything", meaning that the ZIP file will be over a gigabyte in size. Also, there are many possible filter combinations, and many of them are likely to be selected by the users.

As a result, the ZIP files would be pretty slow to generate (due to the sheer volume of data and disk speed) and would contain the whole collection many times over. I don't see how this solution could work without some mega-expensive SCSI RAID array.

Solution

This may be what you need: http://pablotron.org/software/zipstream-php/

This library allows you to build a dynamic, streaming ZIP file without ever writing it to disk.
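A hedged sketch of how the library is typically used (method names follow the original Pablotron-era ZipStream-PHP API and may differ in newer forks, so check them against the version you install):

```php
<?php
require_once 'zipstream.php';

// Stream the archive straight to the client; nothing is written to disk.
$zip = new ZipStream('photos.zip');

// $paths: the filtered list of local files (from the user's criteria).
foreach ($paths as $path) {
    // The JPEGs are already compressed, so ZIP-level compression buys
    // little here; for small files, reading each one whole is fine.
    $zip->add_file(basename($path), file_get_contents($path));
}

$zip->finish(); // emits the ZIP central directory and ends the response
```

Because entries are emitted as they are added, the download starts almost immediately and memory use stays proportional to one file, not the whole archive.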
