How to unzip a file that is not in UTF-8 format in Java
Question
I have a file, e.g. test.zip. With a ZIP tool like WinRAR it is easy to extract (test.zip unzips to test.csv). But test.csv is not in UTF-8 format. My problem is that when I use Java to unzip it, Java cannot read the file.
ZipFile zf = new ZipFile("C:/test.zip");
The exception that is thrown says that an error occurred while opening the file.
The Java article at http://java.sun.com/developer/technicalArticles/Programming/compression/ says nothing about data encoding. Maybe the whole API is designed only for UTF-8 data. So if I have to unzip data that is not in UTF-8, how do I do it? This matters especially for Japanese and Chinese characters, which take up more space in encodings other than UTF-8. I also found an API at http://truezip.java.net/6/tutorial.html where this problem is mentioned, but I could not work out how to solve it. Is there a simple way to solve this problem, preferably with an API that comes from a Java Specification Request (JSR)?
Recommended answer
JDK 6 has a bug in its java.util.zip implementation: it cannot handle non-US-ASCII characters in entry names. I used the Apache Commons commons-compress-1.0.jar library to work around it. JDK 7 fixed the java.util.zip implementation: http://docs.oracle.com/javase/7/docs/api/java/util/zip/ZipInputStream.html
import java.io.*;
import org.apache.commons.compress.archivers.ArchiveEntry;
import org.apache.commons.compress.archivers.zip.*;

public class ZipUtil {
    // Extracts inputZip into outputFolder and returns the number of files written.
    public static int unzip(File inputZip, File outputFolder) throws IOException {
        int count = 0;
        FileInputStream fis = null;
        ZipArchiveInputStream zis = null;
        FileOutputStream fos = null;
        try {
            byte[] buffer = new byte[8192];
            fis = new FileInputStream(inputZip);
            // Decode entry names as Cp1252; "true" also honours Unicode extra fields.
            // This is what makes non-US-ASCII names work.
            zis = new ZipArchiveInputStream(fis, "Cp1252", true);
            ArchiveEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                File file = new File(outputFolder, entry.getName());
                if (entry.isDirectory()) {
                    file.mkdirs();
                } else {
                    count++;
                    file.getParentFile().mkdirs();
                    fos = new FileOutputStream(file);
                    int read;
                    // Copy the current entry's bytes unchanged; no text decoding happens here.
                    while ((read = zis.read(buffer, 0, buffer.length)) != -1) {
                        fos.write(buffer, 0, read);
                    }
                    fos.close();
                    fos = null;
                }
            }
        } finally {
            // Close quietly; a failure here must not hide an exception from the try block.
            try { if (zis != null) zis.close(); } catch (Exception e) { }
            try { if (fis != null) fis.close(); } catch (Exception e) { }
            try { if (fos != null) fos.close(); } catch (Exception e) { }
        }
        return count;
    }
}
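On JDK 7 or later, the fix mentioned in the answer can be used without an extra library, because java.util.zip.ZipInputStream has a constructor that takes a Charset for decoding entry names. The following is only a minimal sketch of that approach, not a drop-in replacement: the class name CharsetUnzip and the nameCharset parameter are made up for illustration, and the right charset depends on the tool that created the archive.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper class; the name is an assumption for this example.
public class CharsetUnzip {

    // Extracts inputZip into outputFolder, decoding entry names with nameCharset.
    // Returns the number of files written.
    public static int unzip(File inputZip, File outputFolder, Charset nameCharset)
            throws IOException {
        int count = 0;
        byte[] buffer = new byte[8192];
        // try-with-resources (Java 7+) closes both streams even if an exception is thrown.
        try (InputStream fis = new FileInputStream(inputZip);
             ZipInputStream zis = new ZipInputStream(fis, nameCharset)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                File file = new File(outputFolder, entry.getName());
                if (entry.isDirectory()) {
                    file.mkdirs();
                    continue;
                }
                file.getParentFile().mkdirs();
                try (OutputStream fos = new FileOutputStream(file)) {
                    int read;
                    // Entry contents are copied as raw bytes; the charset only affects names.
                    while ((read = zis.read(buffer, 0, buffer.length)) != -1) {
                        fos.write(buffer, 0, read);
                    }
                }
                count++;
            }
        }
        return count;
    }
}
```

A call might look like CharsetUnzip.unzip(new File("C:/test.zip"), new File("C:/out"), Charset.forName("Shift_JIS")), where Shift_JIS is only a guess at what a Japanese archive uses. The charset applies to entry names and comments only; the bytes of test.csv are written out unchanged, so the CSV itself still has to be read with its own encoding afterwards.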