ZipInputStream(InputStream, Charset) decodes ZipEntry file name incorrectly

Question

Java 7 is supposed to fix an old problem with unpacking zip archives with character sets other than UTF-8. This can be achieved by constructor ZipInputStream(InputStream, Charset). So far, so good. I can unpack a zip archive containing file names with umlauts in them when explicitly setting an ISO-8859-1 character set.

But here is the problem: When iterating over the stream using ZipInputStream.getNextEntry(), the entries have wrong special characters in their names. In my case the umlaut "ü" is replaced by a "?" character, which is obviously wrong. Does anybody know how to fix this? Obviously ZipEntry ignores the Charset of its underlying ZipInputStream. It looks like yet another zip-related JDK bug, but I might be doing something wrong as well.

...
zipStream = new ZipInputStream(
    new BufferedInputStream(new FileInputStream(archiveFile), BUFFER_SIZE),
    Charset.forName("ISO-8859-1")
);
while ((zipEntry = zipStream.getNextEntry()) != null) {
    // wrong name here, something like "M?nchen" instead of "München"
    System.out.println(zipEntry.getName());
    ...
}


Recommended answer

OMG, I played around for two or so hours, but just five minutes after I finally posted the question here, I bumped into the answer: My zip file was not encoded with ISO-8859-1, but with Cp437. So the constructor call should be:

zipStream = new ZipInputStream(
    new BufferedInputStream(new FileInputStream(archiveFile), BUFFER_SIZE),
    Charset.forName("Cp437")
);
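To see why the wrong charset produces a "?" rather than an exception, it can help to decode the raw name bytes by hand. The sketch below is only an illustration: the class name ZipNameDecoding and the helper decode are made up for this example, and the byte array assumes the archive stored "ü" as the single Cp437 byte 0x81. Decoded as ISO-8859-1, that byte becomes the control character U+0081, which many consoles render as "?".

```java
import java.nio.charset.Charset;

public class ZipNameDecoding {

    // Hypothetical helper: decodes raw entry-name bytes with the named charset.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Assumption: the zip tool wrote the name in Cp437, where 'ü' is byte 0x81.
        byte[] raw = {'M', (byte) 0x81, 'n', 'c', 'h', 'e', 'n'};

        String asCp437 = decode(raw, "Cp437");
        String asLatin1 = decode(raw, "ISO-8859-1");

        // Print the code point of the second character instead of the character
        // itself, so the result does not depend on the console encoding.
        System.out.println(Integer.toHexString(asCp437.charAt(1)));  // fc (U+00FC, 'ü')
        System.out.println(Integer.toHexString(asLatin1.charAt(1))); // 81 (U+0081, a control character)
    }
}
```

Neither decoding fails, because every byte value is valid in both charsets; only the resulting character differs.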

Now it works like a charm. Sorry for bothering you anyway. I hope this helps someone else facing similar problems.
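For anyone who wants to reproduce the behaviour without a real archive, here is a minimal in-memory round trip. It is a sketch under assumptions, not the original poster's code: the class ZipCharsetRoundTrip and the methods writeArchive and listNames are hypothetical names, and the archive is created with ZipOutputStream(OutputStream, Charset) using Cp437 to stand in for a zip tool that writes names in that code page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipCharsetRoundTrip {

    // Writes a one-entry archive whose entry name is encoded with nameCharset.
    static byte[] writeArchive(String entryName, Charset nameCharset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes, nameCharset)) {
            out.putNextEntry(new ZipEntry(entryName));
            out.write(new byte[] {1, 2, 3}); // placeholder content
            out.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Lists entry names, decoding them with nameCharset.
    static List<String> listNames(byte[] archive, Charset nameCharset) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in =
                 new ZipInputStream(new ByteArrayInputStream(archive), nameCharset)) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        Charset cp437 = Charset.forName("Cp437");
        byte[] archive = writeArchive("M\u00FCnchen.txt", cp437);

        // Same charset as the writer: the name survives intact.
        System.out.println(listNames(archive, cp437).get(0).equals("M\u00FCnchen.txt"));

        // Wrong charset: byte 0x81 is decoded as U+0081, so the name no longer matches.
        String wrong = listNames(archive, Charset.forName("ISO-8859-1")).get(0);
        System.out.println(wrong.equals("M\u00FCnchen.txt"));
    }
}
```

The zip format does not record which code page was used for entry names (apart from a flag that marks UTF-8), so the reader has to know or guess the charset the writing tool used.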
