How should an HTTP client properly parse a *chunked* HTTP response body?

Question

When chunked HTTP transfer encoding is used, why does the server need to write out both the chunk size in bytes and have the subsequent chunk data end with CRLF?

Doesn't this make sending binary data "CRLF-unclean" and the method a bit redundant?

What if the data has a 0x0D followed by a 0x0A in it somewhere (i.e. a CRLF that is actually part of the data)? Is the client then expected to adhere to the chunk size explicitly provided at the head of the chunk, or to choke on the first CRLF it encounters in the data?

My understanding so far of the expected client behaviour is to simply take the chunk size provided by the server, proceed to the next line, read exactly that number of bytes from the following data (whether or not it contains a CRLF), then skip the CRLF following the data and repeat the procedure until there are no more chunks. Is this compliant behaviour? If so, what is the point of the CRLF after each data chunk? Readability?

I have done some Web searching on this and also did some reading of the HTTP 1.1 specification, but a definitive answer seems to be eluding me.

Solution

A chunked consumer does not scan the message body for a CRLF pair. It first reads the specified number of bytes, and then reads two more bytes to confirm that they are CR and LF. If they're not, the message body is ill-formed, and either the size was specified improperly or the data was otherwise corrupted.
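The read-by-size procedure described above can be sketched roughly as follows. This is a minimal illustration, not a complete HTTP client: the function name `decode_chunked` is hypothetical, the stream is assumed to be a buffered binary file-like object (such as `io.BytesIO`, or a socket wrapped with `makefile("rb")`) whose `read(n)` returns fewer than `n` bytes only at end of stream, chunk extensions are ignored, and trailer fields are skipped rather than parsed.

```python
import io


def decode_chunked(stream):
    """Decode a chunked body from a buffered binary stream (hypothetical helper)."""
    body = bytearray()
    while True:
        # The chunk-size line is hexadecimal, optionally followed by ";extensions".
        size_line = stream.readline()
        if not size_line.endswith(b"\r\n"):
            raise ValueError("chunk-size line not terminated by CRLF")
        size_field = size_line.split(b";", 1)[0].strip()
        # Note: int() is more lenient than the grammar (it accepts e.g. "0x1a").
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            break  # the zero-size chunk marks the end of the chunk sequence

        # Read exactly `size` bytes; CR or LF bytes inside the data are not special.
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("stream ended in the middle of a chunk")
        body += data

        # Only now check the two bytes that must follow the chunk data.
        if stream.read(2) != b"\r\n":
            raise ValueError("chunk data not followed by CRLF")

    # Skip any trailer fields up to the terminating empty line (or end of stream).
    while stream.readline() not in (b"\r\n", b""):
        pass
    return bytes(body)


# The second chunk carries CRLF pairs as ordinary payload bytes.
wire = b"4\r\nWiki\r\n6\r\na\r\nb\r\n\r\n0\r\n\r\n"
print(decode_chunked(io.BytesIO(wire)))  # b'Wikia\r\nb\r\n'
```

Because the decoder never searches the payload for line endings, the embedded CRLF pairs in the second chunk come through unchanged. The CRLF check after each chunk only detects a size or framing error.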

The trailing CRLF is a belt-and-suspenders assurance (per RFC 2616 section 3.6.1, Chunked Transfer Coding), but it also serves to maintain the consistent rule that fields start at the beginning of the line.
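To make the line-oriented layout concrete, here is a small sketch of the sending side. The function name `encode_chunked` is hypothetical, and the sketch emits no chunk extensions or trailer fields. Each chunk is written as a hexadecimal size line, the raw data, and a closing CRLF, so the next size line always begins at the start of a line.

```python
def encode_chunked(pieces):
    """Frame an iterable of bytes objects as a chunked body (hypothetical helper)."""
    out = bytearray()
    for piece in pieces:
        if not piece:
            continue  # a zero-length chunk would be read as the terminator
        out += format(len(piece), "x").encode("ascii") + b"\r\n"  # size line
        out += piece + b"\r\n"  # data, then the trailing CRLF
    out += b"0\r\n\r\n"  # last chunk, followed by an empty trailer section
    return bytes(out)


print(encode_chunked([b"Wiki", b"a\r\nb"]))
# b'4\r\nWiki\r\n4\r\na\r\nb\r\n0\r\n\r\n'
```

The second piece contains a CRLF of its own, but its size line (`4`) still tells the receiver exactly how many bytes belong to the chunk.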
