Java zip character encoding


Problem description

I'm using the following method to compress a file into a zip file:

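import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;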

public static void doZip(final File inputfis, final File outputfis) throws IOException {

    FileInputStream fis = null;
    FileOutputStream fos = null;

    final CRC32 crc = new CRC32();
    crc.reset();

    try {
        fis = new FileInputStream(inputfis);
        fos = new FileOutputStream(outputfis);
        final ZipOutputStream zos = new ZipOutputStream(fos);
        zos.setLevel(6);
        final ZipEntry ze = new ZipEntry(inputfis.getName());
        zos.putNextEntry(ze);
        final int BUFSIZ = 8192;
        final byte inbuf[] = new byte[BUFSIZ];
        int n;
        while ((n = fis.read(inbuf)) != -1) {
            zos.write(inbuf, 0, n);
            crc.update(inbuf, 0, n); // checksum only the n bytes actually read
        }
        ze.setCrc(crc.getValue());
        zos.finish();
        zos.close();
    } catch (final IOException e) {
        throw e;
    } finally {
        if (fis != null) {
            fis.close();
        }
        if (fos != null) {
            fos.close();
        }
    }
}

My problem is that I have flat text files with content such as N°TICKET, and after unzipping, the result contains some weird characters: N° TICKET. Characters such as é and à are also not handled correctly.

I guess it's due to the character encoding, but I don't know how to set it to ISO-8859-1 in my zip method.

(I'm running on Windows 7, Java 6)

Solution

You are using streams, which write exactly the bytes they are given. Writers interpret character data and convert it to the corresponding bytes, and Readers do the opposite. Java (at least in version 6) doesn't provide an easy way to mix and match operations on zipped data with operations for writing characters.
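
To make the stream/writer distinction concrete, here is a minimal sketch showing that the same text becomes different bytes depending on the charset. The class name `EncodingBytesDemo` and the `hex` helper are made up for illustration. It also shows why UTF-8 bytes that are later decoded as ISO-8859-1 come out with an extra character, which is one plausible explanation for the garbled output in the question.

```java
import java.nio.charset.Charset;

public class EncodingBytesDemo {

    // Hypothetical helper: renders bytes as space-separated uppercase hex.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset utf8 = Charset.forName("UTF-8");
        Charset latin1 = Charset.forName("ISO-8859-1");

        // "N°", written with a Unicode escape so that the encoding of this
        // source file does not affect the demonstration.
        String text = "N\u00B0";

        byte[] asUtf8 = text.getBytes(utf8);
        byte[] asLatin1 = text.getBytes(latin1);

        System.out.println(hex(asUtf8));   // prints "4E C2 B0"
        System.out.println(hex(asLatin1)); // prints "4E B0"

        // Decoding the UTF-8 bytes as ISO-8859-1 yields three characters
        // instead of two: the byte C2 shows up as a stray extra character.
        String misread = new String(asUtf8, latin1);
        System.out.println(misread.length()); // prints "3"
    }
}
```

A byte stream such as `ZipOutputStream` copies whichever of these byte sequences it is handed; only a Reader or Writer can change one into the other.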

The following approach works, although it is a little clunky.

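import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

File inputFile = new File("utf-8-data.txt");
File outputFile = new File("latin-1-data.zip");

ZipEntry entry = new ZipEntry("latin-1-data.txt");

// Decode the input explicitly as UTF-8; FileReader would use the platform default charset
BufferedReader reader = new BufferedReader(
    new InputStreamReader(new FileInputStream(inputFile), Charset.forName("UTF-8"))
);

ZipOutputStream zipStream = new ZipOutputStream(new FileOutputStream(outputFile));
BufferedWriter writer = new BufferedWriter(
    new OutputStreamWriter(zipStream, Charset.forName("ISO-8859-1"))
);

zipStream.putNextEntry(entry);

// This is the important part:
// all character data is written via the writer and not the zip output stream
String line = null;
while ((line = reader.readLine()) != null) {
    writer.append(line).append('\n');
}
writer.flush(); // the writer is buffered, so flush it to the
// underlying zip output stream before closing the entry

zipStream.closeEntry();
zipStream.finish();

reader.close();
writer.close();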
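
For readers who are not tied to Java 6, the same idea can be packaged as a reusable method. This is only a sketch under assumptions: the class name `TranscodingZip` and the method `zipAs` are made up for illustration, and try-with-resources requires Java 7 or later, which the question's setup does not have. It copies characters in blocks rather than line by line, so the original line endings are kept.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class TranscodingZip {

    /**
     * Hypothetical helper: decodes {@code input} using {@code sourceCharset}
     * and stores it as a single zip entry encoded in {@code targetCharset}.
     */
    public static void zipAs(File input, File output, String entryName,
                             Charset sourceCharset, Charset targetCharset)
            throws IOException {
        // Resources are closed in reverse order: closing the writer flushes it
        // and closes the zip stream, which finishes the entry and the archive.
        try (Reader reader = new BufferedReader(
                 new InputStreamReader(new FileInputStream(input), sourceCharset));
             ZipOutputStream zip = new ZipOutputStream(new FileOutputStream(output));
             Writer writer = new BufferedWriter(
                 new OutputStreamWriter(zip, targetCharset))) {

            zip.putNextEntry(new ZipEntry(entryName));

            // All character data goes through the writer, never directly
            // through the zip stream, so it is re-encoded on the way out.
            char[] buffer = new char[8192];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }
}
```

With the file names from the answer above, a call would look like `TranscodingZip.zipAs(new File("utf-8-data.txt"), new File("latin-1-data.zip"), "latin-1-data.txt", Charset.forName("UTF-8"), Charset.forName("ISO-8859-1"))`. Note that the `ZipOutputStream(OutputStream, Charset)` constructor added in Java 7 controls the encoding of entry names and comments, not of the file content, so it does not replace the writer here.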
