What encoding to use when interpreting HTTP/1.1 header field values


Question




In the HTTP/1.1 spec, this is what I find when it comes to defining headers:

message-header = field-name ":" [ field-value ]

[...]

field-value = *( field-content | LWS )

field-content = <the OCTETs making up the field-value and consisting of either *TEXT or combinations of token, separators, and quoted-string>

and the definitions of OCTET and TEXT are:

OCTET = <any 8-bit sequence of data>

TEXT = <any OCTET except CTLs, but including LWS>

where CTL refers to the control characters of the US-ASCII character set.

Question: Header names (called field-name in the grammar) are encoded in US-ASCII, as the HTTP/1.1 spec specifies, but how would a server application know which encoding to use for header values?

Note: I assume values are normally US-ASCII encoded too, but the definition leaves enough room for other cases.

Solution

The semantics of non-ASCII code points is essentially undefined. Avoid them.
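One way to follow the "avoid them" advice on the sending side is to check a value before it goes into a header. The sketch below is only an illustration: `is_ascii_header_value` is a hypothetical helper name, and the accepted range (printable US-ASCII plus space and horizontal tab) is an assumption about what counts as "safe", not a quotation of the grammar.

```python
def is_ascii_header_value(value: str) -> bool:
    """Return True if value holds only printable US-ASCII, space, or tab.

    Hypothetical helper: the accepted range is an assumption made for
    illustration, chosen to exclude control characters and non-ASCII.
    """
    return all(ch == "\t" or 0x20 <= ord(ch) <= 0x7E for ch in value)


if __name__ == "__main__":
    for candidate in ("text/html; charset=utf-8", "caf\u00e9", "a\r\nb"):
        print(repr(candidate), is_ascii_header_value(candidate))
```

A value that fails this check would need some ASCII-only representation agreed on by both ends before being sent.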

Recipients usually decode using ISO-8859-1, which at least allows recovery later on (because it'll preserve all octets).
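The octet-preserving property can be seen in a minimal Python sketch. The function names `decode_header_value` and `recover_as_utf8` are hypothetical, and the idea that the sender actually used UTF-8 is an assumption made only for the example; nothing in the spec tells the recipient so.

```python
def decode_header_value(raw: bytes) -> str:
    """Decode field-value octets as ISO-8859-1.

    Every byte 0x00-0xFF maps to exactly one code point, so this never
    fails and never loses information.
    """
    return raw.decode("iso-8859-1")


def recover_as_utf8(value: str) -> str:
    """Re-interpret an ISO-8859-1-decoded value as UTF-8 when possible.

    Assumption for illustration: the application has some reason to
    believe the sender used UTF-8. If the octets are not valid UTF-8,
    the ISO-8859-1 reading is returned unchanged.
    """
    try:
        return value.encode("iso-8859-1").decode("utf-8")
    except UnicodeDecodeError:
        return value


if __name__ == "__main__":
    raw = "caf\u00e9".encode("utf-8")      # octets as they arrive on the wire
    value = decode_header_value(raw)       # always succeeds
    assert value.encode("iso-8859-1") == raw  # original octets are recoverable
    print(recover_as_utf8(value))
```

Because the round trip through ISO-8859-1 is lossless, the decision about the "real" encoding can be deferred to the code that knows what a particular header is supposed to contain.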

(Also: you're looking at the wrong spec; RFC 2616 was obsoleted by RFC 7230, which has itself since been obsoleted by RFC 9110 and RFC 9112.)
