How to unzip a file that is not in UTF-8 format in Java


Question

I have a file, e.g. test.zip. With a ZIP tool like WinRAR it is easy to extract (test.zip unzips to test.csv). But test.csv is not in UTF-8 format. My problem is that when I use Java to unzip it, Java cannot read this file.

ZipFile zf = new ZipFile("C:/test.zip");

The exception that is thrown says that an error occurred while opening the file.

The Java article at http://java.sun.com/developer/technicalArticles/Programming/compression/ says nothing about data encoding. Maybe the whole API is designed only for UTF-8 data. So if I have to unzip data that is not in UTF-8, how do I do it? I am especially concerned with Japanese and Chinese characters, which take up more space in encodings other than UTF-8. I also found an API at http://truezip.java.net/6/tutorial.html where this problem is mentioned, but I could not work out how to solve it from there. Is there a simple way to solve this problem, preferably with an API that has gone through the Java Specification Request process?

Recommended answer

JDK 6 has a bug in its java.util.zip implementation: it cannot handle non-US-ASCII characters in entry names. I use the Apache Commons library commons-compress-1.0.jar to work around it. JDK 7 fixed the java.util.zip implementation (ZipInputStream and ZipFile gained constructors that take a Charset); see http://docs.oracle.com/javase/7/docs/api/java/util/zip/ZipInputStream.html

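import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.commons.compress.archivers.ArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipArchiveInputStream;

// The class name is arbitrary; the original snippet showed only the method.
public class Unzipper {

    /** Extracts inputZip into outputFolder and returns the number of files written. */
    public static int unzip(File inputZip, File outputFolder) throws IOException {
        int count = 0;
        FileInputStream fis = null;
        ZipArchiveInputStream zis = null;
        FileOutputStream fos = null;
        try {
            byte[] buffer = new byte[8192];
            String root = outputFolder.getCanonicalPath();
            fis = new FileInputStream(inputZip);
            // The second argument is the encoding of the entry names; this supports non-US-ASCII names.
            zis = new ZipArchiveInputStream(fis, "Cp1252", true);
            ArchiveEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                File file = new File(outputFolder, entry.getName());
                // Reject entries such as "../x" that would land outside outputFolder.
                String path = file.getCanonicalPath();
                if (!path.equals(root) && !path.startsWith(root + File.separator)) {
                    throw new IOException("Entry outside target folder: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    file.mkdirs();
                } else {
                    count++;
                    file.getParentFile().mkdirs();
                    fos = new FileOutputStream(file);
                    int read;
                    while ((read = zis.read(buffer, 0, buffer.length)) != -1) {
                        fos.write(buffer, 0, read);
                    }
                    fos.close();
                    fos = null;
                }
            }
        } finally {
            // Close whatever was opened; failures while closing are ignored.
            try { if (zis != null) zis.close(); } catch (IOException e) { /* ignore */ }
            try { if (fis != null) fis.close(); } catch (IOException e) { /* ignore */ }
            try { if (fos != null) fos.close(); } catch (IOException e) { /* ignore */ }
        }
        return count;
    }
}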
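
On JDK 7 or later, the fix mentioned above can probably be used without any extra library. The following is only a minimal sketch: the class name CharsetZipReader and the method readLines are made up for illustration, and Shift_JIS in main is just an assumed example encoding. It relies on the java.util.zip.ZipFile(File, Charset) constructor to decode entry names, and on InputStreamReader to decode the file content, since the encoding of the names and the encoding of the data inside test.csv are two separate things.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class, not part of any library.
final class CharsetZipReader {

    /**
     * Reads all lines of all file entries in the archive.
     * nameCharset decodes the entry names; contentCharset decodes the file bytes.
     * Each returned element has the form "entryName: line".
     */
    static List<String> readLines(File zip, Charset nameCharset, Charset contentCharset)
            throws IOException {
        List<String> lines = new ArrayList<String>();
        // ZipFile(File, Charset) is available since Java 7.
        try (ZipFile zf = new ZipFile(zip, nameCharset)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                try (InputStream in = zf.getInputStream(entry);
                     BufferedReader reader =
                             new BufferedReader(new InputStreamReader(in, contentCharset))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        lines.add(entry.getName() + ": " + line);
                    }
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: CharsetZipReader <zip file>");
            return;
        }
        // Assumed example encoding; replace it with the one the archive really uses.
        Charset cs = Charset.forName("Shift_JIS");
        for (String line : readLines(new File(args[0]), cs, cs)) {
            System.out.println(line);
        }
    }
}
```

Which charset to pass depends on the tool and locale that produced the archive, so it may take some trial with candidates such as Shift_JIS or GBK before the names and the content come out readable.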
