Stream decoding of Base64 data


Question

I have some large base64 encoded data (stored in snappy files in the hadoop filesystem). This data was originally gzipped text data. I need to be able to read chunks of this encoded data, decode it, and then flush it to a GZIPOutputStream.

Any ideas on how I could do this instead of loading the whole base64 data into an array and calling Base64.decodeBase64(byte[]) ?

Am I right if I read the characters till the '\r\n' delimiter and decode it line by line? e.g. :

for (int i = 0; i < byteData.length; i++) {
    if (byteData[i] == CARRIAGE_RETURN || byteData[i] == NEWLINE) {
       if (i < byteData.length - 1 && byteData[i + 1] == NEWLINE)
            i += 2;
       else 
            i += 1;

       byteBuffer.put(Base64.decodeBase64(record));

       byteCounter = 0;
       record = new byte[8192];
    } else {
        record[byteCounter++] = byteData[i];
    }
}

Sadly, this approach doesn't give any human readable output. Ideally, I would like to stream read, decode, and stream out the data.

Right now, I'm trying to put in an inputstream and then copy to a gzipout

byteBuffer.get(bufferBytes);

InputStream inputStream = new ByteArrayInputStream(bufferBytes);
inputStream = new GZIPInputStream(inputStream);
IOUtils.copy(inputStream , gzipOutputStream);

And it gives me a java.io.IOException: Corrupt GZIP trailer

Recommended answer

Let's take it step by step:

  1. You need a GZIPInputStream to read the gzipped data (not a GZIPOutputStream; the output stream is used to compress data). With this stream you will be able to read the uncompressed, original binary data. Its constructor requires an InputStream.

  2. You need an input stream capable of reading the Base64-encoded data. I suggest the handy Base64InputStream from Apache Commons Codec (http://commons.apache.org/proper/commons-codec/javadocs/api-release/org/apache/commons/codec/binary/Base64InputStream.html). With the constructor you can set the line length and the line separator, and set doEncode=false to decode data. This in turn requires another input stream: the raw, Base64-encoded data.

  3. This stream depends on how you get your data; ideally the data is already available as an InputStream, and the problem is solved. If not, you may have to use a ByteArrayInputStream (if binary), a StringBufferInputStream (if a string; note that this class is deprecated in the JDK), etc.
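For data that is already in memory, a minimal sketch of obtaining that innermost stream could look like the following. The class and method names (Base64Sources, fromBytes, fromString) are hypothetical and only for illustration. Because StringBufferInputStream is deprecated, the string case here converts to bytes and reuses ByteArrayInputStream; US-ASCII is assumed to be sufficient since the Base64 alphabet is plain ASCII.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: wraps in-memory Base64 text as an InputStream.
class Base64Sources {

    // Binary case: the Base64 text is already held as a byte array.
    public static InputStream fromBytes(byte[] base64Bytes) {
        return new ByteArrayInputStream(base64Bytes);
    }

    // String case: Base64 characters are all ASCII, so US_ASCII loses nothing.
    public static InputStream fromString(String base64Text) {
        return new ByteArrayInputStream(base64Text.getBytes(StandardCharsets.US_ASCII));
    }
}
```

Either stream can then be handed to the Base64 decoding stream in place of one read from Hadoop.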

The logic is roughly this:

InputStream fromHadoop = ...;                                  // 3rd paragraph
Base64InputStream b64is =                                      // 2nd paragraph
    new Base64InputStream(fromHadoop, false, 80, "\n".getBytes("UTF-8"));
GZIPInputStream zis = new GZIPInputStream(b64is);              // 1st paragraph

Please pay attention to the arguments of Base64InputStream (line length and end-of-line byte array); you may need to tweak them.
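As a self-contained illustration of the same chain, here is a sketch that runs with the JDK alone. It swaps in the JDK's java.util.Base64 MIME decoder (Base64.getMimeDecoder().wrap(...)) in place of commons-codec's Base64InputStream, so it is not the exact code from the answer. The class name, method names, and sample data are made up for the demonstration, and the in-memory input stands in for the stream that would come from Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class: Base64 decode and gunzip in one streaming pass.
class StreamingBase64Gunzip {

    // Reads Base64 text from the source, decodes it, gunzips it, and copies
    // the plain bytes to the sink in 8 KiB chunks; nothing is buffered whole.
    public static void decodeAndGunzip(InputStream base64Source, OutputStream sink)
            throws IOException {
        // The MIME decoder skips line separators and other non-Base64 bytes.
        InputStream decoded = Base64.getMimeDecoder().wrap(base64Source);
        try (InputStream plain = new GZIPInputStream(decoded)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = plain.read(buffer)) != -1) {
                sink.write(buffer, 0, n);
            }
        }
    }

    // Builds made-up test input: gzip the text, then Base64 encode it with
    // 76-character lines separated by CRLF.
    public static byte[] gzipThenBase64(String text) throws IOException {
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return Base64.getMimeEncoder(76, new byte[] {'\r', '\n'})
                .encode(compressed.toByteArray());
    }

    public static void main(String[] args) throws IOException {
        byte[] encoded = gzipThenBase64("some originally gzipped text data\n");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        decodeAndGunzip(new ByteArrayInputStream(encoded), out);
        System.out.print(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

The sink can be any OutputStream, including a GZIPOutputStream if the data has to be compressed again on the way out. If the goal is only to recover the original gzip bytes, the GZIPInputStream layer can be left out and the decoded stream copied directly.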
