Creating tar archive with national characters in Java
Question
Do you know of a library or technique in Java for generating a tar archive whose file names are encoded in the proper Windows national code page (for example, cp1250)?
I've tried Java tar; example code:
final TarEntry entry = new TarEntry( files[i] );
String filename = files[i].getPath().replaceAll( baseDir, "" );
entry.setName( new String( filename.getBytes(), "Cp1250" ) );
out.putNextEntry( entry );
...
It doesn't work: national characters are broken when I extract the tar on Windows. I've also found a strange thing: under Linux, Polish national characters are shown correctly only when I use ISO-8859-1:
entry.setName( new String( filename.getBytes(), "ISO-8859-1" ) );
This is despite the fact that the proper Polish code page is ISO-8859-2, which doesn't work either. I've also tried Cp852 for Windows, with no effect.
I know the limitations of the tar format, but changing it is not an option.
Thanks for your suggestions,
Recommended answer
Officially, TAR doesn't support non-ASCII characters in headers. However, I was able to use UTF-8 encoded filenames on Linux.
You should try this:
String filename = files[i].getName();
byte[] bytes = filename.getBytes("Cp1250");
entry.setName(new String(bytes, "ISO-8859-1"));
out.putNextEntry( entry );
This at least preserves the Cp1250 bytes in the TAR headers, because ISO-8859-1 maps every byte value to the character with the same code point.
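To see why this round trip keeps the bytes intact, here is a minimal standalone sketch that uses only the JDK and no tar library. The class name TarNameEncoding and the helper toHeaderName are made up for illustration. It assumes the JDK provides the windows-1250 charset (Java's canonical name for Cp1250) and that the tar library writes each character of the entry name as a single byte, which I have not verified for every library. Note that the code in the question does the opposite: new String(filename.getBytes(), "Cp1250") decodes platform-default bytes as Cp1250 instead of encoding the name to Cp1250.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class TarNameEncoding {

    // Hypothetical helper: encode the name in the target code page, then wrap
    // each resulting byte in a char with the same value (ISO-8859-1 decoding).
    static String toHeaderName(String name, Charset target) {
        byte[] raw = name.getBytes(target);
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Assumption: this charset is available in the running JDK.
        Charset cp1250 = Charset.forName("windows-1250");

        // A Polish sample name written with Unicode escapes so the source
        // file encoding does not matter.
        String original = "za\u017c\u00f3\u0142\u0107.txt";
        String headerName = toHeaderName(original, cp1250);

        // Simulate a library that writes one byte per character.
        byte[] written = headerName.getBytes(StandardCharsets.ISO_8859_1);

        // The written bytes should match the Cp1250 encoding of the name.
        System.out.println(Arrays.equals(written, original.getBytes(cp1250)));

        // A reader that assumes Cp1250 should recover the original name.
        System.out.println(new String(written, cp1250).equals(original));
    }
}
```

The value passed to entry.setName will look garbled if you print it in Java; that is expected, since it is only a carrier for the raw Cp1250 bytes.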