Reading a huge Zip file in java - Out of Memory Error


Question

I am reading a ZIP file using Java as below:

Enumeration<? extends ZipEntry> zes = zip.entries();
while (zes.hasMoreElements()) {
    ZipEntry ze = zes.nextElement();
    // do stuff..
}

I am getting an out-of-memory error; the zip file size is about 160 MB. The stack trace is as below:

Exception in thread "Timer-0" java.lang.OutOfMemoryError: Java heap space
at java.util.zip.InflaterInputStream.<init>(InflaterInputStream.java:88)
at java.util.zip.ZipFile$1.<init>(ZipFile.java:229)
at java.util.zip.ZipFile.getInputStream(ZipFile.java:229)
at java.util.zip.ZipFile.getInputStream(ZipFile.java:197)
at com.aesthete.csmart.batches.batchproc.DatToInsertDBBatch.zipFilePass2(DatToInsertDBBatch.java:250)
at com.aesthete.csmart.batches.batchproc.DatToInsertDBBatch.processCompany(DatToInsertDBBatch.java:206)
at com.aesthete.csmart.batches.batchproc.DatToInsertDBBatch.run(DatToInsertDBBatch.java:114)
at java.util.TimerThread.mainLoop(Timer.java:534)
at java.util.TimerThread.run(Timer.java:484)

How do I enumerate the contents of a big zip file without having to increase my heap size? Also, when I don't enumerate the contents and just access a single file like this:

ZipFile zip = new ZipFile(zipFile);
ZipEntry ze = zip.getEntry("docxml.xml");

Then I don't get an out-of-memory error. Why does this happen? How does a ZipFile handle zip entries? Another option would be to use a ZipInputStream. Would that have a smaller memory footprint? I eventually need to run this code on a micro EC2 instance on the Amazon cloud (613 MB RAM).
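For reference, a ZipInputStream reads the archive sequentially and only holds one entry's inflater state plus a small buffer at a time. A minimal sketch of that approach (the tiny in-memory archive is only there to make the example self-contained; class and method names are illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class StreamZip {

    // Walks the archive sequentially with ZipInputStream: per entry, only the
    // inflater state and an 8 KB buffer are resident, regardless of archive size.
    static long totalDecompressedBytes(InputStream rawZip) throws IOException {
        long total = 0;
        byte[] buf = new byte[8192];
        try (ZipInputStream zis = new ZipInputStream(rawZip)) {
            ZipEntry ze;
            while ((ze = zis.getNextEntry()) != null) {
                int n;
                while ((n = zis.read(buf)) != -1) {
                    total += n; // process each chunk here instead of buffering the whole file
                }
                zis.closeEntry();
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny archive in memory so the example is self-contained.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            zos.putNextEntry(new ZipEntry("docxml.xml"));
            zos.write("<doc/>".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        System.out.println(totalDecompressedBytes(new ByteArrayInputStream(bos.toByteArray())));
        // prints 6, the decompressed size of the single entry
    }
}
```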

More information on what I do with the zip entries after I get them:

Enumeration<? extends ZipEntry> zes = zip.entries();
while (zes.hasMoreElements()) {
    ZipEntry ze = zes.nextElement();
    S3Object s3Object = new S3Object(bkp.getCompanyFolder() + map.get(ze.getName()).getRelativeLoc());
    s3Object.setDataInputStream(zip.getInputStream(ze));
    s3Object.setStorageClass(S3Object.STORAGE_CLASS_REDUCED_REDUNDANCY);
    s3Object.addMetadata("x-amz-server-side-encryption", "AES256");
    s3Object.setContentType(Mimetypes.getInstance().getMimetype(s3Object.getKey()));
    s3Object.setContentDisposition("attachment; filename=" + FilenameUtils.getName(s3Object.getKey()));
    s3objs.add(s3Object);
}

I get the input stream from the zip entry and store it in the S3Object. I collect all the S3Objects in a list and then finally upload them to Amazon S3. For those who don't know Amazon S3: it's a file storage service; you upload files via HTTP.

I'm thinking maybe this is happening because I'm collecting all the individual input streams? Would it help if I processed them in batches, say 100 input streams at a time? Or would it be better to unzip first and upload the unzipped files instead of storing the streams?

Accepted Answer

It is very unlikely that you get an out-of-memory exception because of processing a ZIP file. The Java classes ZipFile and ZipEntry don't contain anything that could possibly fill up 613 MB of memory.

What could exhaust your memory is to keep the decompressed files of the ZIP archive in memory, or - even worse - keeping them as an XML DOM, which is very memory intensive.

Switching to another ZIP library will hardly help. Instead, you should look into changing your code so that it processes the ZIP archive and the contained files as streams, and only keeps a limited part of each file in memory at a time.
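Concretely, instead of collecting every entry's InputStream in a list, each entry can be pumped to its destination through a fixed buffer and closed before moving on to the next. A minimal sketch of such a copy loop (the ByteArray streams below are stand-ins for the real zip entry stream and the upload target, which this answer doesn't prescribe):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopy {

    // Pump one entry's stream straight to its destination through a fixed
    // buffer, so at most 8 KB of file data is in memory at any moment.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long copied = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            copied += n;
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // A ByteArrayOutputStream stands in for the real (hypothetical) upload target.
        byte[] fileData = new byte[50_000];
        ByteArrayOutputStream dest = new ByteArrayOutputStream();
        long n = copy(new ByteArrayInputStream(fileData), dest);
        System.out.println(n); // prints 50000
    }
}
```

With this shape, the memory footprint per entry is the buffer size, not the file size.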

BTW: It would be nice if you could provide more information about the huge ZIP files (do they contain many small files or a few large files?) and about what you do with each ZIP entry.

Update:

Thanks for the additional information. It looks like you keep the contents of the ZIP file in memory (although it somewhat depends on the implementation of the S3Object class, which I don't know).

It's probably best to implement some sort of batching as you propose yourself. You could for example add up the decompressed size of each ZIP entry and upload the files every time the total size exceeds 100 MB.
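That grouping logic could be sketched as below, assuming the decompressed sizes are available (e.g. from ZipEntry.getSize() when the archive records them). The entry names, sizes, and the limit here are illustrative, and the caller is expected to upload and release each batch:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class Batcher {

    // Group (name, decompressedSize) pairs into batches whose total size stays
    // under the limit; each batch can then be uploaded and freed before the next.
    static List<List<String>> batch(Map<String, Long> entrySizes, long limit) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long total = 0;
        for (Map.Entry<String, Long> e : entrySizes.entrySet()) {
            if (!current.isEmpty() && total + e.getValue() > limit) {
                batches.add(current);       // flush the batch before it overflows
                current = new ArrayList<>();
                total = 0;
            }
            current.add(e.getKey());
            total += e.getValue();
        }
        if (!current.isEmpty()) batches.add(current);
        return batches;
    }

    public static void main(String[] args) {
        // Toy sizes; real code would use ZipEntry.getSize() and a ~100 MB limit.
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("a.dat", 60L);
        sizes.put("b.dat", 60L);
        sizes.put("c.dat", 30L);
        System.out.println(batch(sizes, 100L)); // prints [[a.dat], [b.dat, c.dat]]
    }
}
```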
