Zipping a huge folder by using a ZipFileSystem results in OutOfMemoryError

Question

The java.nio package offers an elegant way of handling zip files by treating them as file systems, so the contents of a zip file can be handled like ordinary files. Zipping a whole folder then comes down to using Files.copy to copy every file into the zip file. Since subfolders have to be copied as well, we need a visitor:

private static class CopyFileVisitor extends SimpleFileVisitor<Path> {
    private final Path targetPath;
    // Set to the first directory visited, i.e. the root of the source tree
    private Path sourcePath = null;

    public CopyFileVisitor(Path targetPath) {
        this.targetPath = targetPath;
    }

    @Override
    public FileVisitResult preVisitDirectory(final Path dir,
            final BasicFileAttributes attrs) throws IOException {
        if (sourcePath == null) {
            // The root itself maps onto targetPath, so nothing is created for it
            sourcePath = dir;
        } else {
            // Recreate the subdirectory under the target, using its relative path
            Files.createDirectories(targetPath.resolve(
                    sourcePath.relativize(dir).toString()));
        }
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFile(final Path file,
            final BasicFileAttributes attrs) throws IOException {
        // toString() lets a path from one file system be resolved against another
        Files.copy(file,
                targetPath.resolve(sourcePath.relativize(file).toString()),
                StandardCopyOption.REPLACE_EXISTING);
        return FileVisitResult.CONTINUE;
    }
}

This is a simple "copy directory recursively" visitor. With the ZipFileSystem, the same visitor can also copy a directory into a zip file, like this:

public static void zipFolder(Path zipFile, Path sourceDir) throws IOException
{
    // Initialize the Zip Filesystem and get its root
    Map<String, String> env = new HashMap<>();
    env.put("create", "true");
    URI uri = URI.create("jar:" + zipFile.toUri());

    // Closing the file system is what writes the zip file out to disk
    try (FileSystem fileSystem = FileSystems.newFileSystem(uri, env)) {
        Iterable<Path> roots = fileSystem.getRootDirectories();
        Path root = roots.iterator().next();

        // Simply copy the directory into the root of the zip file system
        Files.walkFileTree(sourceDir, new CopyFileVisitor(root));
    }
}

This is what I call an elegant way of zipping a whole folder. However, when I use this method on a huge folder (around 3 GB), I get an OutOfMemoryError (heap space). A conventional zip handling library does not raise this error. So it seems that the way the ZipFileSystem handles the copy is very inefficient: too much of the data to be written is kept in memory, and the OutOfMemoryError occurs.

Why is this the case? Is the ZipFileSystem generally considered inefficient in terms of memory consumption, or am I doing something wrong here?

Recommended answer

I looked at ZipFileSystem.java and I believe I found the source of the memory consumption. By default, the implementation uses a ByteArrayOutputStream as the buffer when compressing files, which means it is limited by the amount of memory assigned to the JVM.

There is an (undocumented) property, "useTempFile", that can be set in the env map passed to FileSystems.newFileSystem to make the implementation use temporary files instead. It works like this:

Map<String, Object> env = new HashMap<>();
env.put("create", "true");
env.put("useTempFile", Boolean.TRUE);
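
Put together with the zipping code from the question, this could look like the following sketch. The class name TempFileZip is made up for illustration, and the visitor is inlined as an anonymous class to keep the example self-contained. Note that the value is passed as Boolean.TRUE; as far as I can tell, older implementations compare against Boolean.TRUE, so the string "true" may be ignored there.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name, used only for this sketch.
public class TempFileZip {

    public static void zipFolder(Path zipFile, Path sourceDir) throws IOException {
        Map<String, Object> env = new HashMap<>();
        env.put("create", "true");            // create the zip file if it does not exist
        env.put("useTempFile", Boolean.TRUE); // buffer entries in temp files, not on the heap
        URI uri = URI.create("jar:" + zipFile.toUri());

        // Closing the file system is what writes the zip file out to disk.
        try (FileSystem zipFs = FileSystems.newFileSystem(uri, env)) {
            Path root = zipFs.getPath("/");
            Files.walkFileTree(sourceDir, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                        throws IOException {
                    // The source root maps onto the zip root, so skip it.
                    if (!dir.equals(sourceDir)) {
                        Files.createDirectories(
                                root.resolve(sourceDir.relativize(dir).toString()));
                    }
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                        throws IOException {
                    Files.copy(file,
                            root.resolve(sourceDir.relativize(file).toString()),
                            StandardCopyOption.REPLACE_EXISTING);
                    return FileVisitResult.CONTINUE;
                }
            });
        }
    }
}
```

A call such as TempFileZip.zipFolder(Paths.get("out.zip"), Paths.get("bigFolder")) would then zip the folder; both paths are placeholders.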

More details here: http://www.docjar.com/html/api/com/sun/nio/zipfs/ZipFileSystem.java.html; the interesting lines are 96, 1358 and 1362.
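
For comparison, the question mentions that a conventional zip library does not run out of memory. A minimal sketch of such a streaming approach with java.util.zip.ZipOutputStream might look as follows. This is not part of the original answer: the class name StreamingZip is made up, and the sketch skips empty directories, which the visitor-based version would preserve.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical class name, used only for this sketch.
public class StreamingZip {

    public static void zipFolder(Path zipFile, Path sourceDir) throws IOException {
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zos = new ZipOutputStream(out);
             Stream<Path> paths = Files.walk(sourceDir)) {
            Iterator<Path> it = paths.filter(Files::isRegularFile).iterator();
            while (it.hasNext()) {
                Path file = it.next();
                // Zip entry names always use forward slashes.
                String name = sourceDir.relativize(file).toString().replace('\\', '/');
                zos.putNextEntry(new ZipEntry(name));
                // Streams the file through a small buffer instead of holding it in memory.
                Files.copy(file, zos);
                zos.closeEntry();
            }
        }
    }
}
```

Each file is written to the archive as it is read, so memory use stays roughly constant regardless of the folder size.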
