Poor Performance of Java's unzip utilities

Question

I have noticed that the unzip facility in Java is extremely slow compared to a native tool such as WinZip.

Is there a third-party library available for Java that is more efficient? Open source is preferred.

Edit

Here is a speed comparison of the Java built-in solution vs 7zip. I added buffered input/output streams to my original solution (thanks Jim, this did make a big difference).

Zip file size: 800K
Java solution: 2.7 seconds
7Zip solution: 204 ms

Here is the modified code using the built-in Java decompression:

/** Unpacks the given zip file using the built-in Java facilities for unzip. */
@SuppressWarnings("unchecked")
public final static void unpack(File zipFile, File rootDir) throws IOException
{
  ZipFile zip = new ZipFile(zipFile);
  Enumeration<ZipEntry> entries = (Enumeration<ZipEntry>) zip.entries();
  while(entries.hasMoreElements()) {
    ZipEntry entry = entries.nextElement();
    java.io.File f = new java.io.File(rootDir, entry.getName());
    if (entry.isDirectory()) { // skip directory entries; parent directories are created below
      continue;
    }

    if (!f.exists()) {
      f.getParentFile().mkdirs();
      f.createNewFile();
    }

    BufferedInputStream bis = new BufferedInputStream(zip.getInputStream(entry)); // get the input stream
    BufferedOutputStream bos = new BufferedOutputStream(new java.io.FileOutputStream(f));
    while (bis.available() > 0) {  // write contents of 'bis' to 'bos', one byte at a time
      bos.write(bis.read());
    }
    bos.close();
    bis.close();
  }
}


Recommended answer

The problem is not the unzipping; it is the inefficient way you write the unzipped data back to disk. My benchmarks show that using

    InputStream is = zip.getInputStream(entry); // get the input stream
    OutputStream os = new java.io.FileOutputStream(f);
    byte[] buf = new byte[4096];
    int r;
    while ((r = is.read(buf)) != -1) {
      os.write(buf, 0, r);
    }
    os.close();
    is.close();

instead reduces the method's execution time by a factor of 5 (from 5 seconds to 1 second for a 6 MB zip file).

The likely culprit is your use of bis.available(). Aside from being incorrect (available() returns the number of bytes that can be read before a call to read() would block, not the number of bytes until the end of the stream), this bypasses the buffering provided by BufferedInputStream, requiring a native system call for every byte copied into the output file.
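To illustrate the correctness half of that point, here is a minimal sketch, not taken from the original post. The class AvailableDemo and its methods are hypothetical names made up for this example. It relies on the fact that the default InputStream.available() returns 0, so a loop guarded by available() can stop before any data is copied, while a loop that reads until read() returns -1 copies everything.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; not part of the original question or answer.
final class AvailableDemo {

    /** Returns a stream over data that keeps the default available(), which always reports 0. */
    static InputStream streamWithoutAvailable(final byte[] data) {
        return new InputStream() {
            private int pos = 0;

            @Override
            public int read() {
                // Return the next byte as an unsigned value, or -1 at end of stream.
                return pos < data.length ? (data[pos++] & 0xff) : -1;
            }
        };
    }

    /** Copies the way the question's loop does: stops as soon as available() reports 0. */
    static byte[] copyWithAvailable(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (in.available() > 0) {
            out.write(in.read());
        }
        return out.toByteArray();
    }

    /** Copies the way the answer's loop does: bulk reads until read() returns -1. */
    static byte[] copyUntilEof(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int r;
        while ((r = in.read(buf)) != -1) {
            out.write(buf, 0, r);
        }
        return out.toByteArray();
    }
}
```

With a five-byte input, copyWithAvailable(streamWithoutAvailable(data)) copies nothing, whereas copyUntilEof(streamWithoutAvailable(data)) copies all five bytes. Whether a given zip entry stream behaves this way depends on the implementation, which is exactly why available() should not be used as an end-of-stream test.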

Note that wrapping the streams in BufferedInputStream and BufferedOutputStream is not necessary if you use the bulk read and write methods as I do above, and that the code to close the resources is not exception safe (if reading or writing fails for any reason, neither is nor os would be closed). Finally, if you have IOUtils (from Apache Commons IO) on the class path, I recommend using its well-tested IOUtils.copy instead of rolling your own.
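One possible way to combine the bulk copy with exception-safe cleanup is try-with-resources (Java 7 and later). The following is a sketch under stated assumptions rather than the answerer's own code: the class name Unzipper and the 8192-byte buffer size are choices made for this example, and it does not validate entry names, so it should only be used on archives you trust.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical class name; a sketch, not the original poster's final code.
final class Unzipper {

    /** Unpacks zipFile into rootDir, copying each entry in bulk and closing all streams on failure. */
    static void unpack(File zipFile, File rootDir) throws IOException {
        byte[] buf = new byte[8192]; // assumed buffer size; the answer above uses 4096
        try (ZipFile zip = new ZipFile(zipFile)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                File f = new File(rootDir, entry.getName());
                if (entry.isDirectory()) {
                    Files.createDirectories(f.toPath());
                    continue;
                }
                // Make sure the parent directory exists before opening the output file.
                Files.createDirectories(f.getParentFile().toPath());
                // Both streams are closed automatically, even if read or write throws.
                try (InputStream is = zip.getInputStream(entry);
                     OutputStream os = Files.newOutputStream(f.toPath())) {
                    int r;
                    while ((r = is.read(buf)) != -1) {
                        os.write(buf, 0, r);
                    }
                }
            }
        }
    }
}
```

If you would rather not write the copy loop at all and do not have Commons IO available, the standard library's Files.copy(InputStream, Path, CopyOption...) can replace the inner loop; pass StandardCopyOption.REPLACE_EXISTING if the target file may already exist.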
