Encoding of filenames containing non-Latin characters when extracting a .tar.gz packed by the Ant tar task

Problem description

I'm building a tar.gz archive using Ant:

<tar destfile="${linux86.zip.file}" compression="gzip" longfile="gnu">
    <tarfileset dir="${work.dir}/data" dirmode="755" filemode="755"  
                prefix="${app.folder}/data"/>
</tar>

The archive is built on Windows. After it is extracted on Ubuntu 12, files whose names contain non-Latin (for example, Cyrillic) characters have broken names.

Is there any way to fix or work around that?

Solution

I found some interesting information on Ant's developer mailing list (30 Jun 2009, 01 Jul 2009) and in ASF Bugzilla (36851, 53811). The problem is old and well known. It has not been fixed, mainly for ideological reasons: not all untar implementations support non-ASCII entry names.

The patch mentioned in the Bugzilla issue was applied in revision 1350857. As a result, TarOutputStream has a constructor that takes the name of the encoding to use for entry names:

public TarOutputStream(OutputStream os, String encoding) { ... }

However, that constructor is never used in the Tar task. So I added an encoding attribute to the Tar task, rebuilt Ant from the modified sources, and used UTF-8 as the encoding of entry names.
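To illustrate why writing entry names as UTF-8 helps, here is a minimal sketch that does not use Ant at all. It only mimics the byte-level round trip of a file name: encoded by the archiver with one charset, decoded by the extractor with another. The class name TarNameEncodingDemo, the helper roundTrip, and the sample file name are hypothetical, and windows-1251 is only an assumption about what the Windows build machine's default charset might have been.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TarNameEncodingDemo {

    /**
     * Encodes a name with the charset the archiver is assumed to use,
     * then decodes the raw bytes with the charset the extractor assumes.
     */
    static String roundTrip(String name, Charset writer, Charset reader) {
        byte[] raw = name.getBytes(writer);
        return new String(raw, reader);
    }

    public static void main(String[] args) {
        // Cyrillic sample name, written with Unicode escapes so the source
        // file itself does not depend on any particular encoding.
        String name = "\u0434\u0430\u043d\u043d\u044b\u0435.txt";

        // Assumption: the Windows machine defaulted to windows-1251,
        // while the Linux side interprets names as UTF-8.
        Charset windows = Charset.forName("windows-1251");
        Charset utf8 = StandardCharsets.UTF_8;

        String broken = roundTrip(name, windows, utf8);
        String intact = roundTrip(name, utf8, utf8);

        System.out.println("windows-1251 -> UTF-8 preserved: " + broken.equals(name));
        System.out.println("UTF-8 -> UTF-8 preserved: " + intact.equals(name));
    }
}
```

Under these assumptions the first round trip does not reproduce the original name, while the second does, which is consistent with the names surviving extraction once the archive is written with UTF-8 entry names.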

Extraction was tested under Ubuntu 11/12 and Mandriva.
