Uncompress GZIPed HTTP Response in Java

Question

I'm trying to uncompress a GZIPed HTTP response using GZIPInputStream. However, I always get the same exception when I try to read the stream: java.util.zip.ZipException: invalid bit length repeat

My HTTP request headers:

GET www.myurl.com HTTP/1.0\r\n
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6\r\n
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n
Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7\r\n
Keep-Alive: 115\r\n
Connection: keep-alive\r\n
X-Requested-With: XMLHttpRequest\r\n
Cookie: Some Cookies\r\n\r\n

At the end of the HTTP response headers, I get path=/Content-Encoding: gzip, followed by the gzipped response.

I tried two similar pieces of code to uncompress it:

UPDATE: In the following code, tBytes = (the string after 'path=/Content-Encoding: gzip').getBytes();

GZIPInputStream  gzip = new GZIPInputStream (new ByteArrayInputStream (tBytes));

StringBuffer  szBuffer = new StringBuffer ();

byte  tByte [] = new byte [1024];

while (true)
{
    int  iLength = gzip.read (tByte, 0, 1024); // <-- Error comes here

    if (iLength < 0)
        break;

    szBuffer.append (new String (tByte, 0, iLength));
}

And this one, which I found on this forum:

InputStream     gzipStream = new GZIPInputStream   (new ByteArrayInputStream (tBytes));
Reader          decoder    = new InputStreamReader (gzipStream, "UTF-8");//<- I tried ISO-8859-1 and get the same exception
BufferedReader  buffered   = new BufferedReader    (decoder);

I guess this is an encoding error.

Regards,

bill0ute

Recommended answer

You don't show how you get the tBytes that you use to set up the gzip stream here:

GZIPInputStream  gzip = new GZIPInputStream (new ByteArrayInputStream (tBytes));

One explanation is that you are including the entire HTTP response in tBytes. Instead, it should contain only the content after the HTTP headers.

Another explanation is that the response is chunked.
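
If the response does turn out to use chunked transfer encoding, the chunk framing has to be removed before the bytes reach GZIPInputStream. The following is a minimal sketch of that step, not a complete implementation: the class name ChunkedBodyDecoder and its methods are hypothetical, chunk extensions are discarded, and trailer fields after the last chunk are ignored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: strips chunked transfer encoding from a message body.
final class ChunkedBodyDecoder {

    /** Reads bytes up to the next LF and returns them as text, without the trailing CR LF. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            line.write(b);
        }
        String text = new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
        return text.endsWith("\r") ? text.substring(0, text.length() - 1) : text;
    }

    /** Decodes a chunked body: hex size line, chunk data, CRLF, repeated until a size of 0. */
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semicolon = sizeLine.indexOf(';');   // drop any chunk extension
            if (semicolon >= 0) {
                sizeLine = sizeLine.substring(0, semicolon);
            }
            int size;
            try {
                size = Integer.parseInt(sizeLine.trim(), 16);
            } catch (NumberFormatException e) {
                throw new IOException("invalid chunk size line: " + sizeLine);
            }
            if (size == 0) {
                break;                               // last chunk; trailers are ignored here
            }
            byte[] chunk = new byte[size];
            int read = 0;
            while (read < size) {
                int n = in.read(chunk, read, size - read);
                if (n < 0) {
                    throw new IOException("truncated chunk");
                }
                read += n;
            }
            body.write(chunk, 0, size);
            readLine(in);                            // consume the CRLF that ends the chunk data
        }
        return body.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] chunked = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n".getBytes(StandardCharsets.ISO_8859_1);
        byte[] plain = decode(new ByteArrayInputStream(chunked));
        System.out.println(new String(plain, StandardCharsets.ISO_8859_1));
    }
}
```

The decoded bytes, not the raw chunked stream, are what would then be wrapped in a ByteArrayInputStream and passed to GZIPInputStream.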

Edit: You are taking the data after the Content-Encoding line as the message body. However, according to the HTTP 1.1 specification, header fields do not come in any particular order, so this is very dangerous.

As explained in this part of the HTTP specification, the message body of a request or response does not come after a particular header field but after the first empty line:


Request (section 5) and Response (section 6) messages use the generic message format of RFC 822 [9] for transferring entities (the payload of the message). Both types of message consist of a start-line, zero or more header fields (also known as "headers"), an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, and possibly a message-body.

You still haven't shown exactly how you compose tBytes, but at this point I think you're erroneously including the empty line in the data that you try to decompress. The message body starts after the CRLF characters of the empty line.
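
To make the "body starts after the empty line" rule concrete, here is a minimal sketch that locates the first CRLF CRLF in the raw response bytes and hands only the bytes after it to GZIPInputStream. The class GzipBodyExtractor and the sampleResponse helper are hypothetical and exist only for illustration; the sketch also assumes the body is not chunked. It keeps the response as bytes from start to finish, since converting a compressed body to a String and back with getBytes() may alter it, depending on the charset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: extracts and decompresses the body of a raw HTTP response.
final class GzipBodyExtractor {

    /** Returns the index of the first byte after the first CRLF CRLF, or -1 if there is none. */
    static int bodyOffset(byte[] raw) {
        for (int i = 0; i + 3 < raw.length; i++) {
            if (raw[i] == '\r' && raw[i + 1] == '\n' && raw[i + 2] == '\r' && raw[i + 3] == '\n') {
                return i + 4;
            }
        }
        return -1;
    }

    /** Decompresses everything after the header block, wherever Content-Encoding appears. */
    static byte[] gunzipBody(byte[] raw) throws IOException {
        int offset = bodyOffset(raw);
        if (offset < 0) {
            throw new IOException("no empty line found after the headers");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(
                new ByteArrayInputStream(raw, offset, raw.length - offset))) {
            byte[] buf = new byte[1024];
            int n;
            while ((n = gzip.read(buf, 0, buf.length)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }

    /** Builds a made-up response for the demo: headers, empty line, gzipped body. */
    static byte[] sampleResponse(String text) throws IOException {
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        // Content-Encoding is deliberately not the last header here.
        String headers = "HTTP/1.0 200 OK\r\nContent-Encoding: gzip\r\nSet-Cookie: a=b; path=/\r\n\r\n";
        raw.write(headers.getBytes(StandardCharsets.ISO_8859_1));
        raw.write(compressed.toByteArray());
        return raw.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = sampleResponse("hello gzip");
        System.out.println(new String(gunzipBody(raw), StandardCharsets.UTF_8));
    }
}
```

Decoding to text with a charset happens only after decompression, so the choice between UTF-8 and ISO-8859-1 cannot affect whether the gzip stream itself is readable.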

May I suggest that you use the httpclient library to extract the message body instead?
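
As an illustration of letting a library do the header parsing, here is a rough sketch. It does not use the httpclient library itself; it swaps in the JDK's own java.net.HttpURLConnection so that the example needs no extra dependency. The class name GzipFetch and the URL passed on the command line are placeholders. HttpURLConnection separates headers from body and handles chunked responses, but it does not decompress gzip, so the sketch wraps the body stream in GZIPInputStream when the Content-Encoding header says gzip.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: fetches a URL and returns the decompressed body bytes.
final class GzipFetch {

    static byte[] fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestProperty("Accept-Encoding", "gzip");
        try (InputStream raw = conn.getInputStream()) {
            // The library has already parsed the headers, in whatever order they arrived.
            boolean gzipped = "gzip".equalsIgnoreCase(conn.getContentEncoding());
            InputStream in = gzipped ? new GZIPInputStream(raw) : raw;
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GzipFetch <url>");
            return;
        }
        // Assumes the page is UTF-8; a real client would check the Content-Type charset.
        System.out.println(new String(fetch(args[0]), StandardCharsets.UTF_8));
    }
}
```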
