HTTP gzip encoding of HTML


Problem description

For a project of mine I'm having to write my own lite web server. At the moment it does what I want it to do, but it's kind of slow, at least too slow for me. So I was thinking about implementing gzip compression to speed things up. Here's how I do it:

    public static String encodeToGZip(String data) {
        ByteArrayOutputStream bout = null;
        try {
            bout = new ByteArrayOutputStream();
            GZIPOutputStream output = new GZIPOutputStream(bout);
            output.write(data.getBytes());
            output.flush();
            output.close();
            bout.close();
        } catch (IOException ex) {
            ex.printStackTrace();
        }

        try {
            return new String(bout.toByteArray(), "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            return null;
        }
    }

The problem is that the browser can't decode the data I send, even though it states that it accepts gzip encoding, so I must be sending corrupt data.

This is the result, as captured with Wireshark:

    GET /login.html HTTP/1.1
    Host: localhost:9090
    Connection: keep-alive
    Cache-Control: no-cache
    Pragma: no-cache
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.79 Safari/535.11
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Encoding: gzip,deflate,sdch
    Accept-Language: en-US,en;q=0.8
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3

    HTTP/1.1 200 OK
    Connection: close
    Server: My Lite Server v0
    Content-Encoding: gzip
    Content-Type: text/html

...............T...N...0....#.......O...?...$...........BB...g...6...[.....u...........6......................g6e...............S......c..$..........`I Gw............AOAhU...XO...d...].... IU...h...+......[.....Y.........b...|x.........rm1.........1.....L...uI.........S...n............F......T2.[$X.......M.....M.#*...........d....58HL:....Wx......Z...........m...t...Z.)'XQdg ......X.........~......(......<.......p/....... ..........."...6|7........3 ...r.Sv.../...rT...."..........SrJ..........M.vR^...4$... .q...x.................../...8...........M...y#...j......7........d..le....;..................~......o....F......

Solution

return new String(bout.toByteArray(), "UTF-8");

This line in your method will produce garbage strings.

The above constructor performs a transcoding operation from the given encoding to UTF-16. You take a bunch of arbitrary bytes and try to decode them as UTF-8. You can only decode UTF-8 encoded character data as UTF-8. Java does not have binary-safe strings (all strings are UTF-16); you must use byte arrays instead.
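To see the corruption concretely, here is a minimal sketch (the class and method names are made up for illustration and are not part of the original server). It compresses a small page, decodes the compressed bytes as UTF-8 the way the question's method does, and re-encodes them. A gzip stream starts with the bytes 0x1f 0x8b, and 0x8b on its own is not valid UTF-8, so the decoder replaces it and the original bytes cannot be recovered.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, only meant to illustrate the answer above.
final class StringRoundTripDemo {

    // Compress text with gzip and return the raw compressed bytes.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bout)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bout.toByteArray();
    }

    // True if the bytes are unchanged after being decoded as UTF-8 and re-encoded.
    static boolean survivesStringRoundTrip(byte[] bytes) {
        String asText = new String(bytes, StandardCharsets.UTF_8);
        byte[] back = asText.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(bytes, back);
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("<html><body>login</body></html>");
        byte[] plain = "plain ASCII text".getBytes(StandardCharsets.UTF_8);
        System.out.println("gzip bytes survive:  " + survivesStringRoundTrip(compressed));
        System.out.println("plain bytes survive: " + survivesStringRoundTrip(plain));
    }
}
```

Real character data survives the round trip; the compressed bytes do not, which is why the browser receives a body it cannot inflate.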

Just write the compressed bytes to your OutputStream.
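One way this could look, as a sketch under a few assumptions: `client` stands for the `OutputStream` of the accepted socket, the class and method names are hypothetical, and the `Content-Length` header and the `charset=utf-8` parameter are additions that the original response did not send. The point is only that the compressed body stays a `byte[]` from start to finish.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; adapt the names to your own server.
final class GzipResponseSketch {

    // Compress raw bytes with gzip; closing the stream writes the gzip trailer.
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bout)) {
            out.write(raw);
        }
        return bout.toByteArray();
    }

    // Write status line, headers and the compressed body to the client stream.
    static void writeGzipResponse(OutputStream client, String html) throws IOException {
        byte[] body = gzip(html.getBytes(StandardCharsets.UTF_8));
        String head = "HTTP/1.1 200 OK\r\n"
                + "Content-Type: text/html; charset=utf-8\r\n"
                + "Content-Encoding: gzip\r\n"
                + "Content-Length: " + body.length + "\r\n" // assumption: not in the original response
                + "Connection: close\r\n"
                + "\r\n";
        client.write(head.getBytes(StandardCharsets.US_ASCII)); // headers are text
        client.write(body);                                     // body is binary, never a String
        client.flush();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for socket.getOutputStream() so the sketch runs on its own.
        ByteArrayOutputStream fakeClient = new ByteArrayOutputStream();
        writeGzipResponse(fakeClient, "<html><body>login</body></html>");
        System.out.println("response bytes written: " + fakeClient.size());
    }
}
```

Only the headers pass through a `String`; the body goes from the `GZIPOutputStream` straight to the client as bytes.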

Avoid using data.getBytes() as it uses the default system encoding. This will produce non-portable code as the default system encoding is system and configuration dependent. Prefer always setting an encoding explicitly.
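A small sketch of why the explicit charset matters (the class and method names are hypothetical). The same one-character string encodes to different bytes depending on the charset, so whatever the platform default happens to be decides what the client receives unless you name the charset yourself.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing charset-dependent encoding.
final class ExplicitCharsetSketch {

    // Encode text as UTF-8 regardless of the platform default charset.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Encode text as ISO-8859-1, one possible platform default on older setups.
    static byte[] toLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // the letter e with an acute accent
        System.out.println("UTF-8 length:      " + toUtf8(text).length);
        System.out.println("ISO-8859-1 length: " + toLatin1(text).length);
    }
}
```

Whichever charset you pick, it should match what the `Content-Type` header tells the client.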
