Encoding errors when compressing files with Apache Commons Compression on Linux

Question

I am compressing files with the Apache Commons Compress API. It works fine on Windows 7, but on Linux (Ubuntu 10.10, UTF-8), characters such as "º" in file and folder names are replaced by "?".

Is there a parameter I should pass to the API when creating or extracting the tar?

I'm using the tar.gz format, following the API examples.

The files I'm trying to compress were created on Windows. Could that be a problem?

Code:

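    import java.io.BufferedOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;

    import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
    import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
    import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
    // The original post does not show its imports; Commons IO's IOUtils.copy works the same way.
    import org.apache.commons.compress.utils.IOUtils;

    public class TarGzTest
    {
        public static void createTarGzOfDirectory(String directoryPath, String tarGzPath) throws IOException
        {
            System.out.println("Criando tar.gz da pasta " + directoryPath + " em " + tarGzPath);

            // try-with-resources closes the streams in reverse order, even if an exception is thrown
            try (FileOutputStream fOut = new FileOutputStream(new File(tarGzPath));
                 BufferedOutputStream bOut = new BufferedOutputStream(fOut);
                 GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(bOut);
                 TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut))
            {
                // Allow entry names longer than 100 bytes (GNU extension); set once per archive
                tOut.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);

                addFileToTarGz(tOut, directoryPath, "");
                tOut.finish();
            }
            System.out.println("Processo concluído.");
        }

        private static void addFileToTarGz(TarArchiveOutputStream tOut, String path, String base) throws IOException
        {
            System.out.println("addFileToTarGz()::" + path);
            File f = new File(path);
            String entryName = base + f.getName();

            if (f.isFile())
            {
                TarArchiveEntry tarEntry = new TarArchiveEntry(f, entryName);
                tOut.putArchiveEntry(tarEntry);

                // Close the input stream after copying so file handles are not leaked
                try (InputStream in = new FileInputStream(f))
                {
                    IOUtils.copy(in, tOut);
                }

                tOut.closeArchiveEntry();
            }
            else
            {
                // Directories get no entry of their own; only their files are added
                File[] children = f.listFiles();

                if (children != null)
                {
                    for (File child : children)
                    {
                        addFileToTarGz(tOut, child.getAbsolutePath(), entryName + "/");
                    }
                }
            }
        }
    }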

(I omitted the main method.)

EDIT (monkeyjluffy): I changed the code so that the archive is always the same on different platforms, so the hash calculated on it is also the same.

Recommended answer

I found a workaround for my problem.

For some reason, Java does not respect my environment's encoding and switches to cp1252.
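To check which encoding the JVM actually picked up, a small diagnostic like the one below may help. It is only a sketch: the class name `ShowJvmEncoding` is made up for illustration, and `sun.jnu.encoding` is an implementation-specific property of OpenJDK-based JVMs, so it may be `null` elsewhere.

```java
import java.nio.charset.Charset;

public class ShowJvmEncoding {
    /** Returns a one-line summary of the encodings the JVM took from its environment. */
    static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
             + ", file.encoding = " + System.getProperty("file.encoding")
             // Implementation-specific property (assumption: OpenJDK-based JVM)
             + ", sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding")
             + ", LANG = " + System.getenv("LANG")
             + ", LC_ALL = " + System.getenv("LC_ALL");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running this in the same way the compression program is launched (same user, same script or service) shows whether the JVM sees the UTF-8 locale that the shell reports.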

After extracting the archive, I go into its folder and run this command:

    convmv --notest -f cp1252 -t utf8 * -r

It converts everything recursively to UTF-8.
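What a cp1252-to-UTF-8 conversion means for a single name can be illustrated at the byte level in Java. This is a sketch of the idea only, not of how convmv is implemented, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Cp1252ToUtf8Demo {
    static final Charset CP1252 = Charset.forName("windows-1252");

    /** Decodes bytes as cp1252 and re-encodes the resulting text as UTF-8. */
    static byte[] cp1252ToUtf8(byte[] cp1252Bytes) {
        String decoded = new String(cp1252Bytes, CP1252);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    /** Formats bytes as lowercase hex, two digits per byte. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "º" (U+00BA) as it ends up in a name written with cp1252
        byte[] stored = "\u00ba".getBytes(CP1252);
        System.out.println("cp1252 bytes: " + hex(stored));               // ba

        // A lone 0xBA byte is not valid UTF-8, so a UTF-8 reader cannot display it
        String misread = new String(stored, StandardCharsets.UTF_8);
        System.out.println("misread is U+FFFD: " + misread.equals("\ufffd")); // true

        // The same character re-encoded as UTF-8
        System.out.println("utf-8 bytes: " + hex(cp1252ToUtf8(stored)));  // c2ba
    }
}
```

The single cp1252 byte is invalid when read as UTF-8, which is consistent with the "?" shown in place of "º" on a UTF-8 system.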

Problem solved, guys.

More information about encoding problems in Linux here.

Thanks, everyone, for the help.
