ZipInputStream(InputStream, Charset) decodes ZipEntry file names incorrectly
Question
Java 7 is supposed to fix an old problem with unpacking zip archives whose entry names use a character set other than UTF-8. The constructor ZipInputStream(InputStream, Charset) is meant for this. So far, so good: I can unpack a zip archive containing file names with umlauts when I explicitly set the ISO-8859-1 character set.
But here is the problem: when iterating over the stream with ZipInputStream.getNextEntry(), the entries have wrong special characters in their names. In my case the umlaut "ü" is replaced by a "?" character, which is obviously wrong. Does anybody know how to fix this? Apparently ZipEntry ignores the Charset of its underlying ZipInputStream. It looks like yet another zip-related JDK bug, but I might be doing something wrong as well.
...
zipStream = new ZipInputStream(
    new BufferedInputStream(new FileInputStream(archiveFile), BUFFER_SIZE),
    Charset.forName("ISO-8859-1")
);
while ((zipEntry = zipStream.getNextEntry()) != null) {
    // wrong name here, something like "M?nchen" instead of "München"
    System.out.println(zipEntry.getName());
    ...
}
Recommended answer
OMG, I played around for two hours or so, but just five minutes after I finally posted the question here, I bumped into the answer: my zip file was not encoded with ISO-8859-1, but with Cp437. So the constructor call should be:
zipStream = new ZipInputStream(
    new BufferedInputStream(new FileInputStream(archiveFile), BUFFER_SIZE),
    Charset.forName("Cp437")
);
Now it works like a charm. Sorry for bothering you anyway. I hope this helps someone else facing similar problems.
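For anyone who wants to reproduce the effect without my archive, here is a minimal sketch. It assumes the JDK in use ships the Cp437 (IBM437) charset, and the class and method names (ZipNameDemo, makeZip, firstEntryName) are made up for illustration. It writes a one-entry zip in memory with Cp437 names, then reads the name back with two different charsets. As far as I understand it, entries without the UTF-8 flag are decoded with whatever Charset you pass in, so a wrong guess does not fail, it just yields wrong characters.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipNameDemo {

    // Builds an in-memory zip with a single empty entry whose name
    // is encoded using the given charset (hypothetical helper).
    static byte[] makeZip(String entryName, String charsetName) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes, Charset.forName(charsetName))) {
            out.putNextEntry(new ZipEntry(entryName));
            out.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Reads the name of the first entry, decoding it with the given charset
    // (hypothetical helper).
    static String firstEntryName(byte[] zip, String charsetName) throws IOException {
        try (ZipInputStream in = new ZipInputStream(
                new ByteArrayInputStream(zip), Charset.forName(charsetName))) {
            ZipEntry entry = in.getNextEntry();
            return entry == null ? null : entry.getName();
        }
    }

    public static void main(String[] args) throws IOException {
        String name = "M\u00fcnchen.txt"; // "München.txt", escaped to avoid source-encoding issues
        byte[] zip = makeZip(name, "Cp437");

        // Decoding with the charset the archive was written with restores the name.
        System.out.println(name.equals(firstEntryName(zip, "Cp437")));

        // Decoding with ISO-8859-1 maps the same byte to a different character,
        // so the name no longer matches.
        System.out.println(name.equals(firstEntryName(zip, "ISO-8859-1")));
    }
}
```

The first line printed should be true and the second false. In Cp437 the "ü" is stored as byte 0x81, which ISO-8859-1 decodes to a control character, and a console will typically show that as "?".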