Encoding errors when compressing files with Apache Commons Compression on Linux


Problem description

I am compressing files with the Apache Commons Compress API. It works fine on Windows 7, but on Linux (Ubuntu 10.10, UTF-8), characters such as "º" in file and folder names are replaced by "?".

Is there a parameter I should pass to the API when compressing, or when extracting the tar?

I'm using the tar.gz format, following the API examples.

The files I'm trying to compress were created on Windows. Could that be the problem?

The code:

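    import java.io.BufferedOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;

    import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
    import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
    import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
    // Assumed import: the original does not say which IOUtils class is used.
    import org.apache.commons.compress.utils.IOUtils;

    public class TarGzTest
    {
        public static void createTarGzOfDirectory(String directoryPath, String tarGzPath) throws IOException
        {
            System.out.println("Criando tar.gz da pasta " + directoryPath + " em " + tarGzPath);

            // try-with-resources closes the streams in reverse order, even if one of them fails to open
            try (FileOutputStream fOut = new FileOutputStream(new File(tarGzPath));
                 BufferedOutputStream bOut = new BufferedOutputStream(fOut);
                 GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(bOut);
                 TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut))
            {
                // Use the GNU extension for entry names that do not fit in the tar header
                tOut.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);
                addFileToTarGz(tOut, directoryPath, "");
                tOut.finish();
            }
            System.out.println("Processo concluído.");
        }

        // Adds a file, or recursively the files below a directory, under the entry prefix "base"
        private static void addFileToTarGz(TarArchiveOutputStream tOut, String path, String base) throws IOException
        {
            System.out.println("addFileToTarGz()::" + path);
            File f = new File(path);
            String entryName = base + f.getName();
            TarArchiveEntry tarEntry = new TarArchiveEntry(f, entryName);

            if (f.isFile())
            {
                tOut.putArchiveEntry(tarEntry);
                // Close the input stream once its content has been copied
                try (InputStream in = new FileInputStream(f))
                {
                    IOUtils.copy(in, tOut);
                }
                tOut.closeArchiveEntry();
            }
            else
            {
                File[] children = f.listFiles();
                if (children != null)
                {
                    for (File child : children)
                    {
                        addFileToTarGz(tOut, child.getAbsolutePath(), entryName + "/");
                    }
                }
            }
        }
    }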

(I have omitted the main method.)

EDIT (monkeyjluffy): The changes I made ensure that the archive is always the same across platforms, so the hash calculated on it is the same.

Solution

I found a workaround for my problem.

For some reason, Java doesn't respect my environment's encoding and changes it to cp1252.
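One way to check which encoding the JVM actually picked up is to print its charset-related settings. The following is only a minimal sketch: the class name `EncodingCheck` and the method `jvmEncodings` are hypothetical, and `sun.jnu.encoding` is an internal, implementation-specific property that may be missing on some JVMs (it is reported as "null" in that case).

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingCheck {

    // Collects the charset-related settings the JVM is currently running with.
    static Map<String, String> jvmEncodings() {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("defaultCharset", Charset.defaultCharset().name());
        // String.valueOf turns a missing property into the text "null"
        settings.put("file.encoding", String.valueOf(System.getProperty("file.encoding")));
        settings.put("sun.jnu.encoding", String.valueOf(System.getProperty("sun.jnu.encoding")));
        return settings;
    }

    public static void main(String[] args) {
        jvmEncodings().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

If the printed values do not match the shell's locale, one possible cause is that the Java process was started without the locale variables (such as `LANG`) set.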

After extracting the archive, I went into the extracted folder and ran this command:

    convmv --notest -f cp1252 -t utf8 * -r

It converts all the file names recursively to UTF-8.
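To illustrate what this conversion does to a single name, here is a small sketch in Java. It assumes, as the answer suggests, that the extracted names are stored as cp1252 bytes; the class name `NameReencode` and the method `cp1252ToUtf8` are made up for this example and are not part of convmv or Commons Compress.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class NameReencode {

    // Decodes raw file-name bytes as windows-1252 and re-encodes them as UTF-8.
    static byte[] cp1252ToUtf8(byte[] rawName) {
        String decoded = new String(rawName, Charset.forName("windows-1252"));
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xBA}; // "º" stored as a single cp1252 byte

        // Read as UTF-8, the lone byte 0xBA is malformed and is replaced by U+FFFD,
        // which is why the name looks broken on a UTF-8 system.
        String misread = new String(raw, StandardCharsets.UTF_8);
        System.out.println("misread as replacement char: " + misread.equals("\uFFFD"));

        // After conversion, the same name is the two-byte UTF-8 sequence for "º".
        byte[] fixed = cp1252ToUtf8(raw);
        System.out.println("fixed length in bytes: " + fixed.length);
        System.out.println("fixed decodes to º: "
                + new String(fixed, StandardCharsets.UTF_8).equals("\u00BA"));
    }
}
```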

Problem solved, guys.

More info about encoding problems in Linux here.

Thanks everyone for the help.
